Introducing The Beep

The Beep - A newsletter that covers a broad range of topics from data technology to artificial intelligence.

Andreas and Alamhanz

Jan 01, 2024

Background

Numerous newsletters discuss data technology and artificial intelligence, yet most of them center on relaying news, compiling articles, running excessive advertising, and overusing buzzwords and jargon. The hype surrounding LLMs and image generation highlights their capabilities, but only a few newsletters dig into the substance and explain how such systems are built.

Our journey into newsletter creation began on November 30, 2023, in a Japanese restaurant in Jakarta. Both of us share a passion for reading and writing. Previously, we wrote several technical articles on Medium; however, we had trouble monetizing them because monetization was not supported in our location.

Our motivation to establish a high-quality newsletter stems from a desire to keep up with current trends and to create a platform where we can follow our curiosity and learn new things. We both enjoy replicating, debugging, and modifying things we find on the internet, particularly in machine learning and deep learning. And so, we've launched a newsletter titled "The Beep" on Substack. As the name implies, we hope the articles will resonate with you, making your mind "beep" as you come to understand the topics.

Topic

The Beep covers a broad spectrum of subjects ranging from data technology to artificial intelligence, and these topics will evolve over time as the field advances. One example is the rapid development of ChatGPT, which launched in late 2022 and progressed swiftly from its early version to its state at the end of 2023.

In data technology, the focus is primarily on how data is stored and processed, including technologies like Polars, vector databases, and data platforms. However, we won't delve deeply into the intricacies of data engineering.

Artificial intelligence content spans from simple yet powerful machine learning algorithms, including recent ones, to sophisticated multimodal deep learning systems capable of hearing, reading, speaking, and writing.

Our three-month plan covers significant topics such as large language models (LLMs), vector databases, and image generation. Each topic will feature several post types, so that we not only address conceptual foundations but also provide hands-on insights and best practices. This approach lets our subscribers engage actively and gain practical experience in each subject.

Type of Post

We plan to classify each post into one of four types:

Concepts
- These posts explain how a method or object works. Examples include introductory discussions like "How LLMs Work in General," insights into the efficiency of Polars ("Why is Polars so fast?"), and explorations of concepts such as "What is Cross Attention?"
Tutorial
- These posts serve as guides to using various tools or methods. Examples include instructional content like "How to fine-tune your own BERT model" or "Guidelines on publishing a model card to the HuggingFace hub."
Best Practices
- Articles falling under this category offer insights for improved implementation of tools or methods. For instance, topics may cover "Strategies to enhance Pandas performance" or "Utilizing multiple GPUs for model training."
Awesome List
- These brief posts compile lists of popular items, like "Top 10 datasets for training LLMs" or "Benchmarking text summarization tasks with 5 models."

Additionally, we plan to feature:

Interview Post
- This post involves conducting interviews with experts in the data technology and AI realm. We aim to gather insights into current trends, challenges, and strategies for overcoming obstacles.
Guest Post
- This article type involves inviting external experts to contribute content on various topics within our newsletter.

Subscription

The newsletter offers two subscription choices: free and paid. Free subscribers gain access to early posts, awesome lists, and a preview of all our content. Paid subscribers enjoy full access to all posts and additional benefits, including tutorials complete with code.

As for the release schedule, we publish twice a week, every Sunday and Thursday. As both of us contribute to the newsletter, we divide the writing responsibilities and focus on discussing and experimenting with topics we find important and engaging.

Author

Andreas Chandra has been a data scientist for more than five years across many industries, including digital media, commerce, retail, fintech, and banking. He studied Informatics and began learning machine learning and deep learning early in his career. Previously, Andreas wrote articles on Medium about technical topics, applications, and his experiences.

Alamsyah K Hanza is a data scientist with more than 7 years of experience in data analysis and data processing at tech companies. He has a background in Mathematics and taught himself data science after graduation through books, articles, and videos. He also has experience building a data community with more than 5k members and, as a data practitioner, helping many of them.
