In this post, we refer to data scientist positions within the industry, not academia, which follows different hiring pathways.
When looking for a job, you will likely have to go through a lot of job specs to select the ones you are both interested in and could be a good fit for. This process may be overwhelming, especially if you are looking for your first role or are changing career. Moreover, the fact that data science is still a bit of an ill-defined field and suffers from a high level of hype does not help. A lot of what is in here would probably translate well to tech fields other than data science (possibly even outside of tech), but there are peculiarities of data science that make it particularly troublesome; one of the many unfortunate effects of the stage data science is in right now is that many people do not know how to hire.
In order to identify the specs worth spending your time on, you need to remove those that are not. We will try to outline the distinguishing features of bad specs in order to provide some guidance on when it is better to let go. The whole hiring process in data science is affected by poor job descriptions. I cultivate the (maybe a tad idealistic) idea that this will change in time, the more people complain about them and the fewer candidates accept them.
Chances are you will encounter a fair number of job specs which tick yes to one (or more) of these things:
- Ask you to know many, many - too many - things at once: they are written like shopping lists of technologies you are supposed to be good at;
- Require a number of years of experience which is incommensurate with the scope of the role;
- Focus much more on tools than on the contribution of a data scientist as a problem solver;
- Are written very much in self-praising mode when describing the company;
- Spend space informing you about the personality of the hire they are looking for: someone who is “passionate”.
These are bad job specs. Note that I am directly excluding all those specs littered with typos, creative use of punctuation or that in any case look like they have been put down in a rush and have not even been proof-read for the basics - those do not deserve your attention.
Truth is, writing a good job spec is hard. You need to have a solid understanding of the role per se and its place in the tech landscape, as well as the job market, and you need to be able to correctly evaluate what level you need to hire for: do you need a starter, someone who may not have work experience but has all the skills desired, or do you require someone who has done (commercial) work already? Do you need someone fully focused on technical work or do you need a manager, someone whose job will be leading, whether a team or the strategy, or both? You also need to be able to delineate why there is a need for data science at the company and not, say, general analytics, or what problems are tackled that require sophistication like machine learning. You also need to have the basic writing skills to be able to produce a good job description. Finally, you need to sell it without making it clear that you are selling it.
Let’s go through the points above one by one.
Specs that want you to know too many things
This is a common find - specs that look like shopping lists. A typical example would contain bullet points along the lines of “proficient in Python, Azure, Scala, Hadoop, AWS, hypothesis testing, SQL, Github, deep learning…”. Many times, these specs do not even separate their shopping list into categories: they put programming languages together with workflow tools, operating systems together with machine learning concepts, software together with statistical notions. It really looks like the author has pulled a few of the most frequent words from a superficial Google search and placed them one alongside the other, rather than thinking about what exact skills are required for the role. This is a big red flag: while it is noble to give people the benefit of the doubt, you can assume that whoever wrote this does not have an understanding of the role itself and why the company is hiring for it. It could be the case that the person/team who wrote the spec is not the same one you would eventually work with, and there might have been things lost in translation, but not spending time to review what is going out is not a sign of general attention to detail: hiring teams should spend time and effort crafting a good presentation card no matter how busy they are - it will go a long way towards finding the best people.
Mentioning tools and concepts as literal bullet points does not demonstrate competence (note this holds for writing a CV too). Of course, when a company looks for a new hire, there will be programming languages, areas/subfields and tools the candidate is required to be knowledgeable about, because they will have to participate in specific projects or contribute to certain products. It is then up to the company’s culture and workflows how open they are to edits and additions to their stack, but some common ground is usually necessary (e.g. if most data work is normally carried out using R, they might hire only people who have experience with it).
The problem arises when the list of job requirements is carelessly cobbled together as a bunch of words and with a flimsy connection to genuine needs. When specifics of a job are well listed in a spec instead, they are usually divided into meaningful areas of focus. An example would be something like:
- Programming: Python proficiency is a must, knowledge of R/Scala a plus;
- Deployment and implementation: experience with a cloud computing service is essential - we operate in AWS but will evaluate candidates who have worked with any cloud provider;
- Data Analysis: solid understanding of data cleansing techniques and statistical validation of results is required;
- Machine Learning: we are looking for people with research experience in state-of-the-art Computer Vision.
In short, you want to consider specs which show the company knows what they are talking about: what actual skills are needed and why. Remember, they might be overshooting and listing more things than they actually need: hiring is a pas de deux where each side is trying to sell themselves to the other and it is up to everyone to read between the lines of what is presented. There is a multitude of articles and blog posts advising job seekers to apply to a job even when not all requirements are met, precisely because of this. I agree with it, although with some caveats: you should understand what the company is looking to fill (and, as we said, the way they present requirements is key) and which macro knowledge requirements are actually important. You should also be able to detect where in the spec some lines are listed in order to attract certain types of profiles rather than others, without there being a real need for deep experience in something. All in all, remember, data science skills are largely about critical thinking and the ability to learn efficiently and independently. Think about what toolbox you have at your disposal and how it would match what they look for: do you think you could meet their expectations and easily get up to speed with what you might be lacking? Or is it the case that what they look for is a profile more knowledgeable than you are in a specific core component?
Specs that require too many years of experience
The most popular way to assess experience in the job market is, generally speaking, by counting the years spent working. It makes sense: the more you have worked, the more experience you have, and this does correlate with the things you know about the job as a whole and the industry - so you can bring more to the table. However, a bare reliance on this measure is in my view old-fashioned and could even backfire in the long run. The requirement for experience length is usually expressed at the very beginning of a spec, indicating it is an essential one. Now, if the position is a “senior” one, it goes without saying that having been there and done that in some form is necessary - we all need to cut our teeth on the first projects before becoming seasoned professionals. If it is a “managerial” role, the company would be looking for someone who has spent time managing (projects and/or people) before. Trouble arises when no seniority in the role is explicitly stated but years of experience are given as a must: what are they based upon? Likely just on an intuitive assessment by the hiring team/manager plus a general feel gathered from what the competition asks for in similar roles, so it is a self-perpetuating illusion.
The trickiest job descriptions are those where the experience required is tied to a specific technology. Let’s leave aside those absurd extremes where the required experience exceeds the time the technology itself has existed for: these cases (and, unfortunately, they are not necessarily rare outliers and may even come from well-known companies) should be discarded in the spirit of the first point above, as whoever wrote the spec evidently did not do their research very well. Asking for experience in a technology is a slippery slope anyway: would someone who has used Python for 10 years be better than someone who has only coded with it for a year? You cannot really tell by just that. If you have worked with Docker for the last 3 years, does it mean you understand everything about containers? One can hardly tell. All this is obviously true for any job in tech, not just in data science, but with data science things can get even trickier, as there are technologies and skills which are more conceptual than practical, and measuring competence in years of use can really be meaningless.
If someone is writing a spec for a data science position and there are some pieces of knowledge required by it, they should focus on assessing competence with means other than counting years of use. In fact, data science is all about measuring values and building quantitative and reliable information, right? So you should expect that the person writing a data job spec does a little data exercise in providing good information. When something like competence cannot be reliably measured and evaluated in numerical ways, it is way better to provide verbal qualifiers to inform about a general level desired:
- Proficiency in Python - this is not saying that 10 years of use are required, but that the hire should be able to write good, robust and tested code (which means they have not just learned the language);
- Familiarity with serverless architecture paradigms - this does not put the accent on how long you have been deploying jobs in a cloud system, but rather informs you that you should understand the concepts and have played with them.
There are of course other methods to assess competence, especially when the area is conceptual. In much the same way academia does when screening candidates, roles which are heavily focused on research may require you to have publications and/or a track record of delivered projects. But for a general data scientist position where seniority is not a precondition, there should be no particular insistence on years of experience (note that it still makes sense to specify if the position is open to someone without experience at all as that is another situation). Unfortunately, this is often not the case. My suggestion is to still apply when you do not meet the experience length requirement (but meet other ones), but to do it with cognizance: present yourself by stressing how you think you would fit the bill, regardless of how long you have spent in work previously. If the hiring team is interested in actual knowledge, they will consider you regardless of actual experience length.
Specs that do not treat you like a problem solver
This one is probably the most common. A good data science job spec should always put most of the focus on what the hire is supposed to work on. Data science can be extremely broad in what it means in the workplace, so the actual work expected can vary a lot as a function of the company’s product, the team culture and composition, the stage of technical and scientific headway and, in many cases, the company’s size. Listing technical criteria for the hire is all well and good, but if that is all there is in the spec, it is not a good presentation card: you as a job seeker should be given at least the gist of what the role encompasses. Sure enough, this will be covered in the interview if you get there, where you should be asking many questions rather than just answering theirs, but a job spec is the door to that - a good one will give you an idea of what you can expect. Many do not do that. They treat you like a bundle of the same shopping list we discussed above rather than as a human with critical thinking capabilities, willing to put them at the company’s service to improve methodologies and generate wealth. A data scientist is a problem solver: their purpose is to tackle problems with quantitative information and to find the most efficient solutions. This requires a lot of “human” skills that technical background alone does not account for - a good data science job spec should put the accent on the intelligence you would bring in, not just the tools you can use.
Specs with a self-praising tone
On top of the buzzwords the data science space is filled with, there are also buzzwords companies use when they want to promote themselves and look good in the market, including in the context of attracting talent. While a bit of self-marketing is a good idea, an excessive use of self-praising terms is an irritating habit and is out of place in a job spec. If you see the continuous use of phrases like:
- “Cutting-edge”,
- “Award-winning”,
- “World-leading”,
- “Disruptive”,
then it is probably one of those cases. I would not cut these specs straight out of your search, but I would pay closer attention to how the company generally describes itself outside of the context of the spec (what presence do they have online, how does their website look, what language do they use in e.g. social media bios?). I would also suggest having an active look at their company values (if they publish them): how do they phrase them? Do they brag a lot there too?
Specs that try describing the personality of the new hire
It is not uncommon that, on top of a description of the company, there is also a description of the desired hire, in the part which takes care of the culture bit, which a good spec will contain. This bit is meant to outline the kind of workflows the company uses at the overarching level and furnish a bird’s-eye view of the relationships within the team and between teams, as well as how the hierarchy flows. The way this part is written can tell you a lot about the company itself, as well as about the writing skills of whoever wrote the spec.
In much the same way as the point above on the self-praising description of the brand, this part can be full of banalities too, like when many adjectives used for the ideal candidate fall in the semantic area of “passionate” and “team-player” (who wants to hire someone who is not interested in the work or bullies others anyway?). The red flag arises when the whole discourse descends into tones that sound almost patronising or arrogant (they list things they do not want from candidates rather than phrasing the concept in the positive) - if this is the case, it is probably the worst of all the points here.