AMD MI300x GPUs with GEMM tuning improves throughput and latency by up to 7.2x

Introduction: In Nscale’s latest technical deep dive, we explore a critical aspect of AI model optimisation: throughput benchmarking, performance tuning, and latency reduction using GEMM (General Matrix Multiplication) tuning. Maximising the performance of GPU-accelerated tasks involves more than just raw speed. Optimising GEMM ensures efficient processing, higher throughput, and the ability to handle complex models […]

The Flexipede Revisited

Kate Sullivan, David Duce, Bob Hopgood. On The Flexipede: Hugh ‘Ras’ Riddle: I didn’t see it as significant as I do now. Tony Pritchett: I didn’t think of it that way either. I thought – this is fun! 1. Introduction: Tony Pritchett created The Flexipede in 1967, probably the first known character computer […]

How to waste bandwidth, battery power, and annoy sysadmins

Okay, let’s talk about something other than feed readers for a moment. How about completely broken web browsers? Yeah, those. This. This is a thing. Count the broken:
ip - - [28/Jun/2024:14:44:26 -0700] "GET /w/2024/05/27/feed/ HTTP/1.1" 200 8052
ip - - [28/Jun/2024:14:44:26 -0700] "GET /w/2024/05/27/feed/ HTTP/1.1" 200 8052
ip - - [28/Jun/2024:14:44:26 -0700] "GET /w/2024/05/27/feed/ […]

Neo Geo Architecture: A practical analysis

Supporting imagery. Model: The Neo Geo ‘AES’. Released on 26/04/1990 in Japan and on 01/07/1991 in America and Europe. Motherboard: Showing revision ‘NEO-AES3-6’. I took this photo using the model I bought. Also, the previous owner re-arranged some capacitors to improve the video signal. Motherboard with important parts labelled. Diagram: Main architecture diagram. A quick introduction: Straight from […]

Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs

Abstract: Recent studies have shown that Large Language Models (LLMs) struggle to accurately retrieve information and maintain reasoning capabilities when processing long-context inputs. To address these limitations, we propose a finetuning approach utilizing a carefully designed synthetic dataset comprising numerical key-value retrieval tasks. Our experiments on models like GPT-3.5 Turbo and Mistral 7B demonstrate that finetuning […]

The ‘Pay Phone Bandit’ Who Baffled the FBI in the ’80s

Most of the sightings were the same. Standing in front of the motel clerk or convenience store worker was a man, roughly 5 feet, 9 inches tall, wearing a baseball cap pulled low and almost touching a pair of gold-rimmed eyeglasses. A ponytail stuck out from the back of the hat. A button-down shirt was […]

DevOps: The Funeral

Published on 2023-06-18 by GCH. Devops is dead, they say. But the death of Devops is no more than the death of a word. A word – like agile or microservices – that is a tad too open for interpretation. Nobody owns it, therefore everybody can have opinions on it. So can I. Especially now […]

Hacking Amazon’s Eero 6 (part 1)

This is the first post in a series on hacking Amazon’s eero 6 (3rd generation) Wi-Fi device. In this post I will be focusing on device disassembly, identifying pins, brute forcing JTAG, and reading serial output. The second part of the blog can be found at: https://markuta.com/eero-6-hacking-part-2/ About: Eero is a San Francisco-based wireless Internet company founded […]