Writing an LLM from scratch, part 22 – training our LLM

This post wraps up my notes on chapter 5 of Sebastian Raschka’s book “Build a Large Language Model (from Scratch)”. Understanding cross entropy loss and perplexity were the hard bits for me in this chapter; the remaining 28 pages were more a case of plugging bits together and running the code, […]

IRS Open Sources its Fact Graph

Legal Disclaimer: Public Repository Access. No Endorsement or Warranty. The Internal Revenue Service (IRS) does not endorse, maintain, or guarantee the accuracy, completeness, or functionality of the code in this repository. The IRS assumes no responsibility or liability for any use of the code by external parties, including individuals, developers, or organizations. This includes, but is […]

ImapGoose

ImapGoose is a small program to keep local mailboxes in sync with an IMAP server. The wording “keep […] in sync” implies that it does so continuously, rather than a one-time sync. ImapGoose is designed as a daemon, monitoring both the IMAP server and the local filesystem, and immediately synchronising changes. When the IMAP server […]

How First Wap Tracks Phones Around the World

In the spring of 2024, Lighthouse found a vast archive of data on the deep web. It contained thousands of phone numbers and hundreds of thousands of locations from nearly every country in the world. The data came from a little-known surveillance company called First Wap. Headquartered in Jakarta but run by a group of […]

I Hate Acrobat

Why? Acrobat is intrusive, slow and non-customizable. Of course there are alternatives, specifically bad ones. FoxIt is slow and non-customizable. Chrome/Firefox kinda work as PDF readers, but are lacking in the feature department. On Linux there is (at least) one non-bad PDF reader. Zathura is amazing with the MuPDF backend. However it only works […]

Next Steps for the Caddy Project Maintainership

tldr: I won’t personally see all comments/issues/PRs anymore; maintainer team is being granted tag+release privileges; community will be more involved with leadership; increase current bus factor of 1; unblock the project where I am the bottleneck; help the project scale better. Caddy is now about 11 years old, and the project has changed a lot […]

Cheap DIY solar fence design

A year ago I installed a 4 kilowatt solar fence. I’m revisiting it this Sun Day, to share the design, now that I have proved it out. Solar fencing manufacturers have some good simple designs, but it’s hard to buy […]

I am sorry, but everyone is getting syntax highlighting wrong

Syntax highlighting is a tool. It can help you read code faster. Find things quicker. Orient yourself in a large file. Like any tool, it can be used correctly or incorrectly. Let’s see how to use syntax highlighting to help you work. Most color themes have a unique bright color for literally everything: one for […]

Things I’ve learned in my 7 Years Implementing AI

Even though the impacts of LLMs have never been seen before, they feel familiar to earlier assumptions. For context: I wasn’t the “PhD scientist,” working on models. I was the guy who worked on productionizing their proof-of-concept code and turning it into something people could actually use. I worked in industries ranging from software/hardware automated […]