Open-Source Interactive Data Analytics with Imhotep

Thursday, Oct 23, 2014 by Indeed Engineering

We are excited to announce the open-source availability of Imhotep, the interactive data analytics platform that powers data-driven decision making at Indeed. When we test changes to our applications and services, whether to our user interface or our backend algorithms, we measure how those changes affect job seekers. We built Imhotep to allow our engineering and product organizations to focus on key metrics at scale.

Key features

The Imhotep platform and tools allow you to:

  • Perform fast, interactive, ad hoc queries and aggregate results for large datasets
  • Combine results from multiple time-series datasets
  • Build your own data tools for analysis, monitoring, reporting, and automated data processing on top of the Imhotep platform

At its core, Imhotep is a distributed inverted index on time-series data that runs across a cluster of servers. We’ve made it easy to set up an Imhotep cluster on Amazon Web Services (AWS). Once you’ve set up your cluster, you can upload your data and then interactively query that data using IQL, the Imhotep Query Language. The IQL web client enables you to answer all sorts of questions about your data, and iterate quickly on those questions to get to important insights.
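
To make the idea of an inverted index on time-series data concrete, here is a minimal single-process sketch in Python. It is not Imhotep's implementation or API: the class name `TinyInvertedIndex`, its methods, and the sample documents are all hypothetical, and the real system distributes this kind of structure across a cluster.

```python
from collections import defaultdict
from datetime import datetime


class TinyInvertedIndex:
    """Toy inverted index over timestamped documents (illustration only)."""

    def __init__(self):
        self.docs = []                    # doc id -> (timestamp, fields)
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids

    def add(self, timestamp, fields):
        """Store a document and index every (field, term) pair it contains."""
        doc_id = len(self.docs)
        self.docs.append((timestamp, fields))
        for field, term in fields.items():
            self.postings[(field, term)].add(doc_id)
        return doc_id

    def count(self, field, term, start, end):
        """Count documents with field == term and start <= timestamp < end."""
        return sum(
            1
            for doc_id in self.postings.get((field, term), ())
            if start <= self.docs[doc_id][0] < end
        )


# Hypothetical search-log documents.
index = TinyInvertedIndex()
index.add(datetime(2014, 10, 1, 9), {"country": "us", "query": "nurse"})
index.add(datetime(2014, 10, 1, 12), {"country": "us", "query": "architecture"})
index.add(datetime(2014, 10, 2, 8), {"country": "gb", "query": "nurse"})

# How many searches came from the US on October 1, 2014?
us_searches = index.count(
    "country", "us", datetime(2014, 10, 1), datetime(2014, 10, 2)
)
print(us_searches)
```

Because each (field, term) pair maps directly to the documents that contain it, a count like this only touches the matching documents rather than scanning the whole dataset.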

For example, at Indeed, we use Imhotep to answer these and many more questions about how people around the world are using our job search engine:

  • How many unique job search queries were performed on a specific day in a specific country?
  • What are the top 50 queries in a specific country? How many times did job seekers click on a search result for each of those queries?
  • Which job titles have the highest click-through rate for the query “Architecture” in the US? Which titles have the lowest click-through rate?

Getting started with Imhotep

You can use our tools to configure your Imhotep cluster on AWS. These setup tools require that you have an AWS account, two S3 buckets for data storage, and your time-series data in TSV or CSV format for uploading into the system.

To learn more, read our Imhotep documentation. If you need help, you can ask questions in our Q&A forum for Imhotep.

To learn more about how we use Imhotep for analytics at Indeed, check out the video and slides of our tech talk from April 2014: Large-Scale Interactive Analytics with Imhotep. If you’re in Austin, join us for our upcoming Imhotep workshop on November 5, 2014.

Using Proctor for A/B Testing from a Non-Java Platform

Wednesday, Sep 3, 2014 by Parker Seidel

We’re excited to announce the open sourcing of proctor-pipet, a tool we created that allows you to deploy Proctor as a remote service. The proctor-pipet tool is a Java web application that exposes Proctor as a simple REST API accessible over HTTP. This means that you can do A/B testing in applications written in non-JVM languages like Python.

In addition to proctor-pipet, we have made available a Python package called django-proctor that makes it easy for your Django web app to use Proctor groups. We look forward to others implementing similar packages for their favorite web frameworks, such as Ruby on Rails or ASP.NET MVC.

These packages are the result of some great work by one of our fantastic summer 2014 interns.

How it works

Your web application makes HTTP requests to Proctor through proctor-pipet. Proctor returns the group assignments, which your web app can then use to make decisions on the content it returns to the user’s browser.
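
The request and response cycle described above might look roughly like the following Python sketch. Everything specific here is an assumption made for illustration: the base URL, the `/groups/identify` path, the `id.` and `ctx.` parameter prefixes, the shape of the JSON response, and the test name `buttoncolortst`. Check the proctor-pipet documentation for the actual API. The sketch builds the request URL and parses a canned response instead of making a network call.

```python
import json
from urllib.parse import urlencode

# Assumed deployment location of proctor-pipet (placeholder).
PIPET_BASE_URL = "http://localhost:8080"


def build_groups_url(identifiers, context, base_url=PIPET_BASE_URL):
    """Build a group-lookup URL from identifiers and context variables.

    The endpoint path and the parameter prefixes are assumptions.
    """
    params = {}
    for name, value in identifiers.items():
        params["id." + name] = value
    for name, value in context.items():
        params["ctx." + name] = value
    return base_url + "/groups/identify?" + urlencode(sorted(params.items()))


def parse_groups(response_body):
    """Map each test name to its assigned group name (assumed JSON shape)."""
    payload = json.loads(response_body)
    groups = payload.get("data", {}).get("groups", {})
    return {test: info["name"] for test, info in groups.items()}


url = build_groups_url({"USER": "abc123"}, {"country": "US"})

# Canned response standing in for what the service might return.
sample_response = (
    '{"data": {"groups": {"buttoncolortst": {"name": "control", "value": 0}}}}'
)
assignments = parse_groups(sample_response)

# The web app then decides what to render based on the assignment.
button_color = "blue" if assignments.get("buttoncolortst") == "control" else "green"
```

In a real application, you would send an HTTP GET to `url` with your HTTP client of choice and pass the response body to `parse_groups`.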

[Figure: data flow for proctor-pipet]

Deploying Proctor remotely through proctor-pipet lets you take advantage of all the features of the Proctor library:

  • Assign users to test groups
  • Use identifiers to map to different test types
  • Toggle features or implement gradual rollouts of new features
  • Make changes to test allocations independently of your code
  • Determine group membership based on rules that use arbitrary context variables (for example, to target mobile devices)
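
To illustrate how an identifier can be mapped to a test group in a stable way, here is a generic hash-based bucketing sketch in Python. It is a common technique for this kind of assignment, not Proctor's actual algorithm; the function `assign_bucket`, the test name `newsearchtst`, and the bucket names are all hypothetical.

```python
import hashlib


def assign_bucket(identifier, test_name, allocations):
    """Deterministically pick a bucket for an identifier.

    allocations is a list of (bucket_name, share) pairs whose shares sum to 1.0.
    The same identifier and test name always produce the same bucket.
    """
    digest = hashlib.sha256((test_name + ":" + identifier).encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a point in [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64
    upper = 0.0
    for bucket, share in allocations:
        upper += share
        if point < upper:
            return bucket
    # Guard against floating-point rounding in the shares.
    return allocations[-1][0]


# A gradual rollout: roughly 10% of identifiers see the new feature.
rollout = [("inactive", 0.9), ("active", 0.1)]
bucket = assign_bucket("abc123", "newsearchtst", rollout)
```

Keeping the allocation list outside the code, as the `rollout` variable hints, is what makes it possible to widen a rollout from 10% to 50% without redeploying the application.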

Proctor documentation is available here.

Download both tools on GitHub: proctor-pipet (https://github.com/indeedeng/proctor-pipet) and django-proctor (https://github.com/indeedeng/django-proctor). Both pages include documentation with examples to help you get started. If you have any questions, ask them in our Proctor Q&A forum.

Bug Bounty Program: Cash Rewards for Reported Vulnerabilities

Sunday, Aug 3, 2014 by Gregory Caswell

As part of Indeed’s focus on constantly improving how we help people get jobs, we are proud to announce the rollout of our bug bounty program. Through Bugcrowd, interested security professionals will now be able to disclose vulnerabilities and be rewarded for their efforts.

For every unique submission that leads to a code change, we will pay between $50 and $1,500, depending on the type and severity of the vulnerability reported. To view everyone who has helped us so far, or just to see how you stack up against the competition, head on over to the Hall of Fame.

Full details on how you can help us improve our services (and get paid!) can be found here. Please keep in mind that attacks against the current user base are strictly prohibited, as are automated vulnerability scanners. Responsible pen testers should always minimize system degradation and impact on users.

Ready to get started? Sign up at Bugcrowd and join over 10,000 security researchers on the largest and most diverse security testing team in the world.
