20+ Gigabytes. There are over 10,000 Starlink satellites in orbit. Each one has a published ephemeris file, which can be found through https://starlink.com/public-files/ephemerides/README.md.

Starlink is also approved for up to ~40,000 satellites, so the full set could eventually reach ~80GB.

Module 1’s project is to build a parser that turns an ephemeris file into a meaningful data structure that future projects can use. But my thought is, why parse just one?

But before going on further, what the heck is an ephemeris? Well this is a new word and concept to me, and it’s pretty cool.

An ephemeris, from what I understand so far, is a “forecast” of sorts for where an object is located (x, y, z) and its velocity in each direction (Vx, Vy, Vz). This data tells you where an object is and where it will be at every “step”, or time interval. In Starlink’s case, that’s every 60 seconds.

An example of what one looks like is as follows.

created:2026-04-26 18:52:05 UTC
ephemeris_start:2026-04-26 18:37:42 UTC ephemeris_stop:2026-04-29 18:37:42 UTC step_size:60
ephemeris_source:blend
UVW
2026116183742.000 -1915.3513713932 6388.1053312065 1572.4944716892 -4.0864197187 -2.6773471182 5.8621175749
4.4974090629e-07 -3.7736004244e-07 7.7068175662e-07 -1.4540774044e-10 -5.6290069703e-11 1.2324178359e-06 7.9157696423e-10
-8.9979874793e-10 2.6191926091e-13 1.9039340119e-12 -4
...
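
The header lines in the example are key:value pairs, and some of them share a line. Here is a minimal sketch of pulling them apart in C++, assuming the layout matches the sample above. The function name `parse_header_line` and the fixed key list are my own placeholders, not something defined by the file format or by the parser described here.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: extract "key:value" fields from one header line.
// Values may contain spaces and colons (timestamps), so a value runs until
// the next known key or the end of the line.
// ASSUMPTION: the key list below is taken only from the sample shown above.
std::map<std::string, std::string> parse_header_line(const std::string& line) {
    static const char* const keys[] = {
        "created:", "ephemeris_start:", "ephemeris_stop:",
        "step_size:", "ephemeris_source:"};
    std::map<std::string, std::string> fields;
    for (const char* key_c : keys) {
        const std::string key(key_c);
        std::size_t start = line.find(key);
        if (start == std::string::npos) continue;  // key not on this line
        start += key.size();
        // The value ends where the next known key begins, if there is one.
        std::size_t end = line.size();
        for (const char* other : keys) {
            const std::size_t pos = line.find(other, start);
            if (pos != std::string::npos && pos < end) end = pos;
        }
        // Trim trailing spaces and a possible carriage return.
        while (end > start && (line[end - 1] == ' ' || line[end - 1] == '\r')) --end;
        // Store under the key name without its trailing ':'.
        fields[key.substr(0, key.size() - 1)] = line.substr(start, end - start);
    }
    return fields;
}
```

Feeding it the second header line of the sample gives three fields, and `std::stoi(fields.at("step_size"))` turns the step into a number. A line with no known key, like `UVW`, comes back as an empty map.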

Here are some planning comments from the C++ parser I’m writing, which show how each entry is composed.

    /**
     * 
     * timestamp: 2026115100342.000 | YYYY DDD HHMMSS.sss
     * position: -1180.1537434667 6646.6792368759 -499.8923176295 | x y z | km
     * velocity: -4.6181676683 -0.3658921704 6.1230849038 | Vx Vy Vz | km/s
     * 
     * covariance-matrix-input 
     * NOTE: this needs to be unpacked into a full 6x6 matrix; only one triangle of the symmetric matrix (21 values) is provided
     *          
     * 4.9529272549e-07 -3.8070394357e-07 7.5785149365e-07 -2.0956695395e-10 1.9229091390e-11 1.1718187001e-06 9.2611637446e-10
     * -9.1585099974e-10 -3.3495200581e-13 7.3311529102e-12 -4.6546576329e-10 4.0028940231e-10 3.2760977512e-13 -8.9320818832e-13
     * 6.0843867840e-13 -7.5580316602e-13 2.0288322327e-13 1.5750218736e-09 -4.6894423182e-16 8.6004590944e-16 5.5708243178e-12
     * 
     * 
     * uncertainty in [x y z vx vy vz] | [U V W Udot Vdot Wdot] | satellite is probably here ± uncertainty
     */

My Vision

I think it’s critical to set a vision for what you want to build early on. It’s one thing to build utilities and CLIs with potential, but having a vision is key to staying focused. Here’s what I have in mind.

There’s more than enough data here to build a model of the constellation at any given time (within the multi-day window of an ephemeris).

Once I have a model of the constellation up and running (think a globe with objects in orbit around it, accurate to the data), I want to overlay it with a routing simulation.

Let me elaborate. Given a point on the globe, i.e. a user terminal (dish), as the source, determine the best node (satellite) to route a packet to. Then find the shortest path through the constellation to a given ground station, which connects back to the terrestrial internet backbone. For this project, we’ll treat the ground station as the final destination.
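
I haven't picked a routing algorithm yet, but one common way to frame "shortest path through the constellation" is Dijkstra's algorithm over a graph where the terminal, the satellites and the ground station are nodes and each usable link is an edge weighted by distance. The sketch below assumes exactly that. `Link`, `Graph` and `shortest_path` are hypothetical names, and how the links and their distances would be derived from the ephemeris data is left open.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical graph: node i has a list of outgoing links.
struct Link {
    std::size_t to;      // index of the node at the other end
    double distance_km;  // edge weight; ASSUMPTION: non-negative distance
};
using Graph = std::vector<std::vector<Link>>;

// Dijkstra's algorithm. Returns the node indices from source to target,
// or an empty vector if the target cannot be reached.
std::vector<std::size_t> shortest_path(const Graph& g, std::size_t source,
                                       std::size_t target) {
    const double inf = std::numeric_limits<double>::infinity();
    const std::size_t none = g.size();  // marker for "no predecessor"
    std::vector<double> dist(g.size(), inf);
    std::vector<std::size_t> prev(g.size(), none);

    using Item = std::pair<double, std::size_t>;  // (distance so far, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> frontier;
    dist[source] = 0.0;
    frontier.push(Item(0.0, source));

    while (!frontier.empty()) {
        const Item top = frontier.top();
        frontier.pop();
        const double d = top.first;
        const std::size_t node = top.second;
        if (d > dist[node]) continue;  // stale queue entry, already improved
        if (node == target) break;     // shortest distance to target is final
        for (const Link& link : g[node]) {
            const double candidate = d + link.distance_km;
            if (candidate < dist[link.to]) {
                dist[link.to] = candidate;
                prev[link.to] = node;
                frontier.push(Item(candidate, link.to));
            }
        }
    }

    std::vector<std::size_t> path;
    if (dist[target] == inf) return path;  // unreachable
    // Walk predecessors back from the target, then reverse into hop order.
    for (std::size_t at = target; at != none; at = prev[at]) path.push_back(at);
    std::reverse(path.begin(), path.end());
    return path;
}
```

The returned vector is already in hop order, so replaying a route is just a matter of iterating over it one element at a time.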

Here’s where I have some crazy ideas. Once the routing is complete, I want to be able to replay it and step through hop by hop. That would be cool.

From Here to There

For this phase of the project, my goal is to just get the data loaded into something useful for future iterations.

I also do not understand all the math involved to make the above possible. But that’s a work in progress. I’m currently working through a College Algebra and Trig Textbook (Blitzer 5e).

C++ is… interesting. I primarily come from a Python and Go background, so you could say I’m going backwards: I started with really high-level languages like Python, then Go, and now C++. I do want to learn Rust as well; maybe I’ll do a version on the side in Rust as I go. We’ll see.

I also deal with a LOT of network programming: writing proxies for TCP or gRPC, etc. These apps are mostly focused on concurrency, simple load-balancing algorithms, and so on. They are not really data-intensive applications, i.e. array slicing and math-heavy algorithms are just not that common. It is nice to work on something different, like parsing text files, which I used to do in Perl years ago… Needless to say, this project stretches all the right brain muscles in all the right places.

I’ll probably hang out in Module 1 for a while until I build some good foundational knowledge and skills.

Being Polite

Back to that 20+ GB of data. I’m really itching to throw some concurrency at their API and see how quickly I can download it. But another name for that is a DoS attack! So I’m being polite. I have a little bash script which downloads one file every second. It takes 2+ hours to download everything, and I’ll cache the files on my own home server. Best not to DoS a site whose purpose is Space Safety! I run the script here and there, download a chunk of files, turn it off, and later continue where I left off. I have a few hundred to work with for the time being.
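
My actual script is bash, but the idea is simple enough to sketch in C++ too: go through the file list, skip anything already cached so a run can pick up where the last one stopped, and sleep between requests. Everything here is illustrative. `download_politely` is a made-up name and `fetch` is a hypothetical callback standing in for whatever really downloads a file.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Download files one at a time with a pause after each request.
// `fetch` is a hypothetical callback that downloads one file and returns
// true on success. `already_cached` holds the names fetched on earlier runs
// and is updated in place, which is what makes stopping and resuming work.
std::size_t download_politely(
    const std::vector<std::string>& names,
    std::set<std::string>& already_cached,
    const std::function<bool(const std::string&)>& fetch,
    std::chrono::milliseconds delay = std::chrono::milliseconds(1000)) {
    std::size_t downloaded = 0;
    for (const std::string& name : names) {
        if (already_cached.count(name) != 0) continue;  // resume: skip cached
        if (fetch(name)) {
            already_cached.insert(name);
            ++downloaded;
        }
        std::this_thread::sleep_for(delay);  // at most one request per delay
    }
    return downloaded;
}
```

At one request per second, 10,000 files means roughly 10,000 seconds of waiting, a bit under three hours, which lines up with the 2+ hours I see.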

To AI or not to AI…

AI is everywhere, but I have to remember that the reason I can steer the AI toward the right solution when working with network-based apps and Go is that I wrote them by hand for years (and Googling…).

I’m fairly certain I could ask AI to build this whole project and it would do a decent job, but would I learn anything? No.

My goal is to use AI as little as possible. I do use it to ask about C++isms, for example, or to get unstuck. But I’m always looking to learn something through using it, not to have it do the work for me. I think that’s the secret to learning: you just need to do it yourself.