In the developer series Behind The Code, we reach out to developers from the community to learn how their journey in data science started, which tools and skills they use, and what is essential to their day-to-day work. For this week’s column, Analytics India Magazine got in touch with Jacob Joseph, Lead Data Scientist at CleverTap.
How It All Started
Joseph, unlike many in this field, took an unusual route to enter the realm of data science. “I am not your conventional data scientist”, asserted Joseph when asked about his journey.
He comes from a commerce background and started his career as an Investment Banker.
In his decade-long career as an investment banker, Joseph accumulated immense experience across a myriad of roles. From overseeing mergers and acquisitions to raising funds as a venture capitalist, he had seen it all.
His foray into data science was a result of his exposure to a quant-based style of investing, which involves the use of mathematical models to price assets. He was fascinated by the far-reaching capabilities of mathematical models, which gave him the much-needed impetus to make the giant leap from the domain of finance into the world of data analytics.
He had to start from ground zero. He picked up books, attended courses in programming, maths and stats while taking up freelancing projects and competing in hackathons in parallel.
“It wasn’t a smooth ride for me,” insists Joseph, when asked about his early days.
It took Joseph three years to land a job as a Data Scientist at CleverTap and, four years on, he now leads their data science team.
At CleverTap
Though the transition was not smooth, the experience gained over his decade-long career as an investment banker did play a vital role in abstracting the problems faced by clients.
Joseph also emphasises the significance of domain expertise in understanding possible pain points, which can then be quantified and improved using data science methodologies.
When asked about the kind of tools and frameworks he uses, Joseph says that it all depends on the problem statement, available data, implementation constraints and the expected solution. That said, he lists linear regression, logistic regression, clustering, SVM, and tree-based algorithms as his go-to options.
“I don’t tend to fall in love with tools,” quips Joseph when asked the much-dreaded question about his favourite programming language.
He stands firm on the idea of having a preference for solving problems rather than getting too worked up about languages, tools and frameworks.
He tries to make the best of both worlds, be it R or Python, TensorFlow or PyTorch.
Making Of A Good Data Scientist
Here are three top practices that, according to Joseph, one comes across on the journey to becoming a data scientist:
- Pre-Modelling Stage: Preparing, exploring and understanding data will take up the bulk of your time and is the most unglamorous task in analytics. There are no shortcuts. You have to keep at it if you are to excel in analytics.
- Storytelling: Another important but overlooked point is storytelling. Most of the time, the consumer of your model or insights is a non-technical person. That consumer may be internal to your company or external. If you are not able to create a story around the insights you have discovered or impress upon the business the benefits of utilising your model, your hard work will remain just on paper.
- Sponsor: Understand the user to whom you are selling. For a business where transparency matters more than predictive accuracy, delivering black-box models is a strict no-no.
Joseph believes in playing to one’s strengths. For instance, if one has a background in programming, an ML Engineer role would be a better fit, whereas for a maths/stats major, ML Researcher type roles might be the right fit.
At CleverTap, Joseph and his team work on instilling confidence and trust in their customers, as they believe that the reason behind a prediction is more valuable to their customers than the prediction itself. To address this challenge, they are shifting their focus towards incorporating Explainable AI approaches into their routine.
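To make the idea of a "reason behind a prediction" concrete, here is a minimal sketch using logistic regression, one of the go-to options Joseph lists. It is an illustration only and not CleverTap's method: the feature names, coefficients and user values are all hypothetical. In a linear model, each feature's contribution to the log-odds is simply its coefficient times its value, so the prediction can be broken down feature by feature.

```python
import math

# Hypothetical coefficients of an already-fitted logistic regression.
# These names and values are assumptions made up for illustration.
INTERCEPT = -1.0
COEFFICIENTS = {
    "sessions_last_7d": 0.4,
    "days_since_last_open": -0.2,
    "push_opt_in": 0.5,
}


def explain_prediction(features):
    """Return the predicted probability and each feature's log-odds contribution."""
    # Contribution of a feature = coefficient * feature value.
    contributions = {
        name: COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    # Rank by absolute size so the strongest drivers come first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


# A hypothetical user record.
user = {"sessions_last_7d": 5, "days_since_last_open": 3, "push_opt_in": 1}
probability, ranked = explain_prediction(user)

print(f"predicted probability: {probability:.3f}")
for name, value in ranked:
    print(f"{name}: {value:+.2f}")
```

The ranked list is the kind of output a non-technical stakeholder can read: which factors pushed the prediction up, which pulled it down, and by how much. Black-box models need additional tooling to produce a comparable breakdown.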
Some Wisdom For Beginners
Joseph preaches what he has practised. He learnt the rudiments of machine learning from books and online courses, and he recommends the same to those who are starting out.
He highly recommends The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman, along with beginner-level courses on Coursera or edX in statistics, probability, linear algebra and calculus.
“Your understanding could be superficial without a solid background in the fundamentals.”
He also underlines the importance of putting theory into practice. For this, he lists a few must-dos:
- Write articles on topics you have learnt. Explaining concepts in easy-to-understand terms is highly appreciated. It not only helps you learn the concepts in depth but also shows others that you know the subject.
- Participate in hackathons, where you can get exposure to real business problems and real business datasets. This will help you not only in gaining the confidence to deal with real problems but also in gauging your own learning curve.
- Maintain a GitHub profile with a good set of data science projects. It is a very good place for prospective employers to assess your skills.
Joseph also advises against pivoting to AI and data science just because it is hot in the news. He asserts that one can still make a difference and grow within the current organisation by applying data science skills.
Future Direction
The future of AI is certainly bright and Joseph bets big on developments in quantum computing and causal models to aid in the next leap of AI.
For CleverTap, which is one of the top mobile platforms that use AI to personalise the customer experience using real-time behavioural data, there is always room for improvement and the team is looking at emerging trends in AI.
“We are doubling down on Explainable AI.”
As stated earlier, the team at CleverTap is determined to incorporate Explainable AI into their workflows, and Joseph reveals that he will be focussing on the development of causal models.
Explainable AI is one of the key emerging themes in AI and is still at a nascent stage. This is what the data science team at CleverTap is doubling down on.
On a closing note, Joseph, who does not buy into the hype around AI, urges ML enthusiasts to keep their inner scepticism alive and not get swayed by the deluge of blogs, videos and other success stories. To this end, he recommends reading ‘Rebooting AI’ by Gary Marcus and Ernest Davis to understand the gaps in current ML techniques.