Zeebra Resource Solutions is a recruitment consultancy headquartered in Prague, Czech Republic. We are recognized as specialists and market leaders in the IT and telco sectors. Our clients are mostly international companies interested in establishing themselves in Central and Eastern Europe.

Big Data Engineer - Hadoop stack

Boy do we have something special for you this year! We are super excited about our new client and hope you will be too after reading this advert.

Our client is a Silicon Valley high-tech software engineering company that works on an unprecedented scale: data, processing speed, computing power, infrastructure. From whatever angle you look at it, it is impressive, if not scary :) Forget about Big Data - they work with Huge Data, and the proof is in the numbers below, so please keep scrolling down.

What they do
Our client's core product is a leading Predictive Analytics Marketing Platform that learns. It is designed to go beyond 1:1 marketing by learning to predict which marketing actions to take with a particular person at a particular moment in time. Through artificial intelligence at big data scale, they optimize performance, awareness, and lift across channels for agencies and marketers.

The technology platform supports a serving system that handles over 200 billion events every day (compared to the 3-4 billion daily Google search queries), a reporting system that aggregates and analyzes terabytes of data in real time, and a learning system that applies machine learning and artificial intelligence techniques over 150 petabytes of data. These systems work in harmony to serve the right advertisement to the right user at the right time.

To support the organization's global strategy, it was decided to set up a software development center in Prague with the objective of attracting top talent in the local and CEE markets. In the initial phase of recruitment we are looking to build a senior core team consisting mainly of technical experts in the following domains:

  • Core Java / JVM internals SW Engineer
  • Data Infrastructure - Java, Scala, Hadoop, Hive, Pig
  • Machine Learning / Artificial Intelligence Experts
  • Web Developers with strong JavaScript

Data Infrastructure is the backbone of the technology platform. It provides the infrastructure that drives the learning, reporting, and serving systems. The platform stores tens of petabytes of data, enables stream and batch processing of massive data sets, provides millisecond latencies to hundreds of terabytes of user data, democratizes data by enabling query access to everyone in the company, and makes it easy to make sense of Big Data. This remarkable feat has been achieved entirely on a large private cloud (1,000+ nodes) built on the strength and robustness of various open-source technologies from the rich and vibrant Apache Hadoop ecosystem.

Examples of work done by Data Infrastructure Engineers:

  • Modeling Data Store - a store of events (backed by HBase) organized in an optimal manner for machine learning. The abstractions and rich data set enable several interesting applications to be quickly built and deployed at scale.
  • Modeling Framework - a machine learning framework built on Spark that enables Rocket Fuel to rapidly build models for everything from click prediction to age/gender prediction. This framework has been built entirely in-house.
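The Modeling Framework itself is proprietary and Spark-based, but to give a rough flavour of what "click prediction" modeling involves, here is a minimal, self-contained sketch in plain Python - a tiny logistic-regression click predictor trained with stochastic gradient descent. All feature names and data below are invented for illustration; the real framework trains such models over petabyte-scale event data:

```python
import math

def sigmoid(z):
    # Squash a raw score into a click probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def train(events, epochs=200, lr=0.1):
    """Fit logistic-regression weights by SGD.

    events: list of (feature_vector, clicked) pairs, clicked in {0, 1}.
    """
    n = len(events[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, y in events:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of log-loss w.r.t. the raw score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    # Predicted probability that this impression gets clicked.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Invented toy features: [hour_of_day / 24, is_mobile] -> clicked?
events = [
    ([0.9, 1.0], 1), ([0.8, 1.0], 1),
    ([0.2, 0.0], 0), ([0.1, 0.0], 0),
]
w, b = train(events)
```

The production version distributes exactly this kind of fit over a Spark cluster and feeds it from the HBase-backed event store, but the underlying idea - learn weights from logged impressions, then score each new bid in real time - is the same.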

What You’ll Need:
  • Experience working with large-scale / high-throughput / multi-tenant distributed systems using two or more of the above technologies
  • Fluency in at least one mainstream programming language (Java, Scala, or Python preferred)
  • Experience building software in at least two languages in a production environment (JVM experience preferred)
  • Strong team player and self-starter
  • Experience as, or interest in becoming, an open-source contributor

Bonus Skills:
  • Experience with configuration management tools such as Puppet
  • Experience with the Big Data ecosystem and tools like Hadoop, Hive, HBase, Spark, etc.

Why apply:
Above all, a pretty unique (for the local market, at least) opportunity to work on a really exciting tech platform with some pretty smart folks (PhDs in Machine Learning and AI from Stanford and Cambridge). Some numbers to support the claims of scale and complexity:

  • 250 billion bid requests per day (43 times the number of searches run on Google in a day)
  • 2 million requests per second at peak
  • the AI completes over 36 billion decisions per second
      • for every bid, considering thousands of active campaigns and multiple objectives
      • for every bid, keeping to an SLA of under 100 milliseconds per single request (70 ms + latency)
  • log volume: hundreds of TB per day
  • physical data centers (core system):
      • 5,000 servers
      • 72,000 processor cores
      • 240 teraflops of compute power
      • 700,000 GB of RAM
      • 150 petabytes of storage (two clusters of 80 and 70 PB)

Besides the fulfillment you'll get from working on all this cool tech, you will also be paid very well and enjoy a high level of autonomy and freedom - that's what we'd expect from a senior, mature professional.

If you are interested in learning more about this cool and exciting opportunity, please send me an email at Alex @ Zeebra Dot CZ.

Job details

ZEEBRA Resource Solutions, s.r.o.
Required education: University
Required languages: English (Advanced)
Listed in: IS/IT: Application and system development, Programmer
Employment form: Full-time work
Contract duration: Permanent
Employment contract: employment contract, contract under Trade Certificate / Identification No.
Employer type: Recruitment agency

ZEEBRA Resource Solutions, s.r.o., Alex Pilnikov