New York
28 days ago
Senior Software Engineer - DataHub Search

Bloomberg’s data-driven products depend on fast, relevant, and secure access to petabytes of structured and unstructured data. The BBDS (Bloomberg Big Data Services) platform powers this scale with distributed systems built on Apache Kafka, MySQL, Vitess, Apache Solr, and other cutting-edge technologies. We use clusters that index and serve millions of documents daily, making financial data easily discoverable across the firm.


Our Team

The DataHub Engineering team provides a distributed platform for hosting datasets, complete with managed data stores, search, discovery, lakehouse, and real-time stream processing capabilities. The platform offers a single place within Bloomberg to discover, access, publish, and subscribe to data.


You’ll join the team that introduced the abstraction of a “dataset” and invented a schema language to formally define all data at Bloomberg, complete with schema evolution, versioning, and true point-in-time semantics.


We’re the team that first brought Kafka, Avro, Dataset Schema Registry, Mesos, Clustered MySQL, Vitess, and Spark into the ecosystem to power a new data-intensive platform that is the hub for financial datasets.


The DataHub’s Search and Discovery Infrastructure, built on Apache Solr, powers the discoverability of those datasets, making Bloomberg’s financial data easy to search, index, and explore. Our systems serve millions of queries daily across hundreds of datasets, driving everything from analytics to real-time data products.


We'll trust you to:

Build tools and automation in Java or Python for indexing, reindexing, and performance tuning

Design and enhance indexing and query pipelines for performance, scalability, and reliability

Debug complex issues involving query latency, indexing pipelines, and distributed systems behavior

Collaborate with engineers across BBDS to enhance data discoverability, security, and scalability

Contribute upstream to open-source search technologies and improve internal frameworks for observability and resilience

Drive initiatives around Vector Indices and Hybrid Search capabilities

Apply performance engineering techniques using tools like eBPF to profile and optimize low-latency systems


You'll need to have:

4+ years of software development experience using Java

Deep systems knowledge of JVM internals, Java, Linux, networking, and distributed systems

Familiarity with low-latency systems and performance tuning using eBPF or similar tools

A degree in Computer Science, Engineering, Mathematics, or equivalent practical experience


We'd love to see:

Experience with Python and/or Go

A passion for scalable, resilient, secure, and observable distributed systems

Expertise in Lucene, Apache Solr, or Elasticsearch (indexing, sharding, scaling, query tuning)

Salary Range: 160,000 - 240,000 USD annually + benefits + bonus
The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.


We offer one of the most comprehensive and generous benefits plans available, with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.
Discover what makes Bloomberg unique - watch our podcast series for an inside look at our culture, values, and the people behind our success.