Sauron

This will be a very brief overview of our 2024 HackUTD hackathon project. There were a couple of very interesting sponsor prompts, and I’m still a little sad we never got the chance to use Pinata. We went with PNC’s prompt, to build tooling for observability and data lifetimes. Of course, the words “observability” and “lifetimes” immediately made us think of Kafka and message queueing. So, for our project, we decided to build a platform to model and display the end-to-end life cycle of data cleanly and efficiently. To that end, with only one all-nighter pulled, we produced Sauron (a very clever name I came up with myself, thank you very much). Here is a link to a demo that, due to time constraints, we had to record before we could fully hook it up to the Confluent instance that we were simulating our Extract, Transform, Load pipeline on. And here is the GitHub repo.

Briefly, we used Terraform to spin up a Confluent instance, as well as several small Python CLI programs (containerized and orchestrated on k8s) which acted as modular consumer-producers to simulate microservices in our pipeline. We used some minimal (and very under-documented) OpenTelemetry Python instrumentation, written directly into the services, which hooked up to a Jaeger instance (also deployed on k8s). Finally, we somehow (I was asleep) relayed the Jaeger data into our front end, which you can see in the video.
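To make the consumer-producer idea a bit more concrete, here is a minimal sketch of the pattern, not the code from our repo. It assumes in-memory queues in place of Kafka topics and a hand-rolled trace ID in the message headers in place of real OpenTelemetry context propagation. Every name in it (Message, run_stage, run_pipeline) is hypothetical.

```python
import queue
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    value: str
    # Stand-in for Kafka record headers; carries the trace context downstream.
    headers: dict = field(default_factory=dict)


def run_stage(name, transform, inbox, outbox, spans):
    """Drain inbox, apply transform to each message, and produce to outbox."""
    while True:
        try:
            msg = inbox.get_nowait()
        except queue.Empty:
            break
        # Continue the upstream trace if there is one, otherwise start a new one.
        trace_id = msg.headers.get("trace_id") or uuid.uuid4().hex
        parent_id = msg.headers.get("span_id")
        span_id = uuid.uuid4().hex[:16]
        # Record one span per processed message (a real setup would export this).
        spans.append(
            {"trace_id": trace_id, "span_id": span_id,
             "parent_id": parent_id, "stage": name}
        )
        outbox.put(
            Message(transform(msg.value),
                    {"trace_id": trace_id, "span_id": span_id})
        )


def run_pipeline(values):
    """Push values through two stages and return (results, recorded spans)."""
    raw, cleaned, loaded = queue.Queue(), queue.Queue(), queue.Queue()
    spans = []
    for v in values:
        raw.put(Message(v))
    run_stage("transform", str.strip, raw, cleaned, spans)
    run_stage("load", str.upper, cleaned, loaded, spans)
    results = []
    while not loaded.empty():
        results.append(loaded.get().value)
    return results, spans
```

Calling run_pipeline([" a ", "b "]) pushes two messages through both stages. Because each stage copies the trace ID forward and records the upstream span as its parent, the spans for one message can be stitched back into a single end-to-end trace, which is roughly the shape of data a tracing backend like Jaeger displays.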

Overall it was a very good, although exhausting, hackathon, and we produced a very interesting project. I learned a lot, and it was definitely the most hands-on experience that I’ve had with Kafka. A couple of takeaways:

Maybe Python isn’t the best language for a hackathon. Go is synonymous with cloud native architecture, and it really felt like Jaeger, and really the entire stack, only tolerated Python.
Tooling, as opposed to consumer software, is difficult to build, partly because the standards and protocols involved have “rougher edges”. This is very similar to my experience working in embedded spaces, where protocols designed to interface with a non-technical user (e.g. USB, SCPI) are much easier to leverage than whatever is going on with QUADSPI.