MakeMyTrip has deployed a large number of open source technologies. From Apache Hadoop, Storm and Spark to Jenkins, OpenTSDB, Grafana and the ELK (Elasticsearch, Logstash and Kibana) stack, the company relies on many community-driven projects.
CTO Sanjay Mohan believes that there is a growing need to focus more on open source because a vast part of the Web is now being open sourced. “Open source solutions have been tried and tested for scale and security by the top-notch companies globally and proven to work well without any vendor lock-in period,” he says.
DataShark project for the community
While adopting open source to scale existing offerings and solve emerging problems has been quite common for MakeMyTrip, Mohan and his team sketched out a plan last October to contribute their own solution back to the community. The result was DataShark, an advanced framework for security and network event analytics. “We felt that the security and open source community would greatly benefit from this framework and hence released it as an open source project,” says Vikram Mehta, senior manager of information security, MakeMyTrip.
The DataShark framework is targeted at security researchers, Big Data analysts and operations teams looking to ingest data from sources such as the file system, Syslog and Kafka in a secure and easy way. The framework also lets experts write custom map-reduce and machine learning algorithms that operate on the ingested data.
DataShark is built on Apache Spark, and custom use cases for the framework are written in Python. It comes with two operation modes: standalone executable and production. The standalone executable mode provides a one-shot analysis of static data, whereas the production mode provides a full-fledged deployment with components such as event acquisition, event queuing, the core data engine and a persistence layer, which together can ingest data from the file system or HDFS.
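To give a feel for the kind of custom use case described above, here is a minimal sketch of a one-shot, map-reduce style analysis over static log lines. It is written in plain Python and does not use DataShark's or Spark's actual API; the log format, the function names (`parse_line`, `count_errors_by_ip`, `flag_suspicious`) and the threshold are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# Hypothetical static input: one "<source_ip> <http_status> <path>" record per line.
SAMPLE_LINES = [
    "10.0.0.1 200 /index",
    "10.0.0.2 404 /admin",
    "10.0.0.2 404 /login",
    "10.0.0.2 404 /wp-admin",
    "malformed",
    "10.0.0.3 200 /search",
]


def parse_line(line: str) -> Optional[Tuple[str, int]]:
    """Map step: turn a raw line into (source_ip, status), or None if malformed."""
    parts = line.split()
    if len(parts) != 3 or not parts[1].isdigit():
        return None
    return parts[0], int(parts[1])


def count_errors_by_ip(lines: Iterable[str]) -> Counter:
    """Reduce step: count 4xx/5xx responses per source IP."""
    counts: Counter = Counter()
    for record in map(parse_line, lines):
        if record is None:
            continue  # skip lines that could not be parsed
        ip, status = record
        if status >= 400:
            counts[ip] += 1
    return counts


def flag_suspicious(counts: Counter, threshold: int = 3) -> List[str]:
    """Return the IPs whose error count reaches the (assumed) threshold."""
    return sorted(ip for ip, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    error_counts = count_errors_by_ip(SAMPLE_LINES)
    print(flag_suspicious(error_counts))
```

In a Spark-based framework the same two steps would typically run as distributed transformations over the ingested data rather than as a local loop, but the shape of the use case (parse each event, aggregate by key, apply a rule) stays the same.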
“We were leveraging machine learning for Web anomaly detection and other use cases, when we found the requirement for a framework that would help security teams to quickly write their own use cases with minimal effort. That brought DataShark to life,” Mehta told Open Source For You.
Major open source solutions powering MakeMyTrip
- Apache Hadoop, Storm and Spark for Big Data analytics
- Guava from Google and CoreNLP from Stanford for machine learning
- Jenkins, OpenTSDB, Grafana, and the ELK stack for production deployment and monitoring pipeline
- Apache HTTPD, Tomcat, NGINX, Django and MySQL for production infrastructure
Challenges in building a community solution
Building DataShark for the community involved some challenges for the team. “Being in the security field, it was challenging to up-skill ourselves to Big Data systems and machine learning,” says Kunal Aggarwal, senior information security engineer, MakeMyTrip.
Aggarwal is part of a four-member team, led by Mehta, that also includes Security Analyst Avinash Jain and Senior Security Analyst Dhruv Kalaan. The team is tasked with maintaining the DataShark repository on GitHub and updating the open source project. Despite being a small team, the members preferred to rely on in-house skill sets to make DataShark successful before releasing it to the public. “Once we had successfully acquired the skillsets, the journey with DataShark has been a breeze,” says Aggarwal.
Moving on from proprietary solutions
Mehta’s team is actively developing strategies to move from the commercialised proprietary world to the fast-growing open source ecosystem. The MakeMyTrip website, which receives over 22 million visits a month, has recently shifted from some Microsoft technologies to open source implementations. “At an optimal operating expenditure, we were able to exercise greater flexibility using open source components on our website,” Mehta states.
MakeMyTrip is also considering open source as a parameter to find the right talent. The company believes that it is important for young professionals to opt for open source skills. “Open source platforms nowadays have become mainstream in the consumer Internet space, and the new-age software professional has to consider exposure to these platforms,” Mohan told Open Source For You.
Security enhancements through open source
DataShark is one of the publicly released examples of how MakeMyTrip is using open source to secure the user experience. However, there are several other community-backed options it uses on the security front. “We leverage various open source deployments to manage security. These have greatly helped to enhance our security posture in the market,” says Mehta.
Open source: An efficient operating model
Like many other open source adopters, the IT experts at MakeMyTrip are highly satisfied with the open source success rate. “The use of open source software not only reflects well on the technical expertise of the organisation, but is also a positive and efficient operating model,” says Mohan.
The CTO concludes that MakeMyTrip will continue to leverage open source on a large scale in the future and contribute to the community wherever possible.