Monday, September 30, 2019

CURRENT JOB OPENINGS IN DATA SCIENCE



post created by : 
1. Amit Kumar Auddy(BSC. IT-Big Data Analytics)
2. Aryadev Banerjee(BSC. IT-Data Science)

Data Science Job Trends

Data Scientist – With the surge in data and its related fields, Data Scientist has become the most sought-after job. Many IT professionals and academicians who have worked in quantitative fields want to become data scientists.
There will be a sharp increase in demand for data scientists by 2020. According to IBM, the number of openings for data professionals will increase by 364,000 to 2,720,000 in the year 2020. Within that total, demand for data scientists and closely related roles alone is expected to reach an astonishing 700,000 openings.

The requirement for data scientists is growing at an exponential rate. This results from the emergence of newer job roles and industries, and it is reinforced by the increase in data and its various types. The number of roles and data scientists will only increase in the future. Some of the positions in data science are data engineer, data science manager and big data architect. Moreover, the financial and insurance industries are becoming major recruiters of data scientists.


Data Science over the next few Decades

Data Science is predicted to grow over the next decade. It is a staggering fact that over 90% of the data in the world was generated in just the last 2 years. It is hard to imagine the amount of data that will be generated in the next decade. The demand for data scientists will rise by 28% by 2020 alone. More and more industries are becoming data hungry, and they need to hire specialized data scientists who can craft products for their customers. About 11.5 million jobs will be created by 2026, according to the U.S. Bureau of Labor Statistics.



Data Science is a rather unrefined and crude term. It is a general term that has several definitions. However, with the passage of time, data science roles will become more concrete. Data science will acquire a concise definition that enables data scientists to handle the corresponding operations. Deeper career paths will be developed in data science. Furthermore, a cleaner set of rules and regulations will differentiate pure data scientists from others.

There will also be a diversification of roles in data science over the next decade. Currently, data science is an uncharted territory that is often misrepresented by various industries. The job role of a data scientist, therefore, suffers from this lack of clear representation. This is because both SQL operators and Data Scientists currently come under the same general definition of data science. This problem will fade away as industries create job roles that are specific to each data role.

Rising Demand for Data Scientists

The rise in demand for data scientists will prompt educational institutes to include data science in their curriculum. Data literacy will increase in the future, and a data scientist will have a specialized standing, much like a doctor or a lawyer. That is, he/she will be part of an entirely new discipline in itself. Recently, many universities have launched data science degrees that will bridge the skill gap in the industries.

Since the field of data science is itself young, data scientists do not have years of experience behind them, compared to other IT-related fields. In the next decade, data scientists will see a much greater distinction between Senior Data Scientists and other positions.



Top Data Science Jobs
1. Data Scientist


Data Scientists are analytical experts who are responsible for finding insights and patterns in the data. A Data Scientist is responsible for handling raw data, analyzing the data, implementing various statistical procedures, visualizing the data and generating insights from it. A Data Scientist is also responsible for handling both structured and unstructured information. A Data Scientist must have knowledge of various tools like Hadoop, R, Python, SAS, etc. Knowledge of data preprocessing, visualization and prediction is among the important requirements for a Data Scientist.


2. Data Architect

A Data Architect is responsible for implementing the blueprints of a company’s data platform. This blueprint or architecture delineates the various models, policies and rules that govern the storage of data as well as its use in the organization. A Data Architect is responsible for organizing and managing data at both the macro level and the micro level. Some of the important tools used by a Data Architect are XML, Hive, SQL, Spark and Pig.
3. Data Engineer

A Data Engineer is responsible for building big data pipelines and models for the data scientists to work on. A Data Engineer must be well versed in both structured and unstructured data. A Data Engineer is responsible not only for building data models but also for maintaining, managing and testing them. Knowledge of database models and ETL are two of the most essential requirements for a Data Engineer. A Data Engineer is responsible for modeling large-scale processing systems using tools like SQL, Hive, Pig, Python, Java, SPSS, SAS etc.


4. Data Science Manager

A Data Science Manager is responsible for handling and managing data science projects. A Data Science Manager handles the team and manages its performance to meet project deadlines. Usually, data science managers have an average of five years of experience in one of the data science domains, such as data engineering, data science or analysis.

Data Science Managers are responsible for planning and curating a roadmap for the data science team to follow. Furthermore, they are responsible for executing the plan of action and delivering the results before the deadline. They should also have strong communication and leadership skills in order to guide the team efficiently.


5. Statistician

Statistician is the oldest job title among all these roles. Before data science, statisticians were employed by companies to use statistical modeling for understanding various trends in the market. A statistician is responsible for implementing A/B testing, harvesting data, describing data, developing inferential statistical tools and performing hypothesis testing. Some of the tools used by statisticians are R, SAS, SPSS, Matlab, Python, Stata, SQL etc.
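As an illustration of the A/B testing and hypothesis testing mentioned above, here is a minimal sketch in Python of a two-proportion z-test. The function name and the visitor counts are hypothetical and chosen only for this example; a real analysis would also check sample-size and independence assumptions.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 200 of 2000 visitors converted on variant A, 260 of 2000 on B.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be due to chance alone.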


6. Machine Learning Engineer

A Machine Learning Engineer is responsible for tailoring machine learning models for tasks such as classification and regression. A Machine Learning Engineer has knowledge of various techniques like clustering and random forests, as well as several deep learning algorithms. It is an advanced field, and people are required to possess analytical aptitude to develop machine learning algorithms. Some of the popular tools used by machine learning engineers are TensorFlow, Keras, PyTorch, scikit-learn, Caffe etc.


7. Decision Scientists
Decision science is a relatively new field. Decision Scientists help the company to make business decisions with the help of tools like Artificial Intelligence and Machine Learning. It is a part of data science that extends to design thinking and behavioral sciences to better understand the clients.




What does a Data Scientist do?

A data scientist mines complex data and delivers systems-related advice to his/her organization. In a team, data scientists manage statistical data and look at what their company needs in order to create different models. A data scientist knows how to interpret data and extract meaning from it. Since data is rarely ever clean, s/he spends time collecting, cleaning, and munging it. S/he needs skills like statistics, machine learning and software engineering, along with persistence and being human. That person also spends time on exploratory data analysis, that is, on visualization and developing a sense of the data. S/he will find patterns, build algorithms and models, design experiments, communicate with team members, and perform data-driven decision making.


Data Science Job Trends in India

Following are the trends in Data Science job postings per 1 million –

From the above chart, we can infer that Data Science jobs have been growing in trending charts. In 2019, it is expected that this trend will only grow higher.


Data Scientist Salary Based on Location

Major metropolitan cities like Bengaluru, Mumbai and Delhi have a high concentration of data science companies as well as thriving tech companies. Furthermore, various burgeoning startups in these cities are working on data science technologies. Therefore, there are ample opportunities to find work in these areas. Let us now have a look at how these cities are ranked based on the salaries offered.


Data Science Salary Based on Skills

The salary also varies based on the skills of a data scientist. Machine Learning is the most in-demand skill that the data science industry requires. As a matter of fact, it is the most sought-after and highly paid skill in data science. Following are some of the key skills based on which salaries are offered to the data scientist.


Data Scientist Salary Based on the Industry

The salary of a data scientist depends on the industry of employment. The data varies by industry, and therefore the specialization of a data scientist also differs across industries. According to LinkedIn, Data Scientists earn the highest salary in the area of Consumer Goods, followed by Finance, Energy & Mining, Media & Communication and Corporate Services.


Data Scientist Salary Based on Degree

According to a survey carried out by LinkedIn, a Ph.D. degree can earn you the highest package in data science. This is followed by a Master’s Degree. Therefore, it is recommended that you have an advanced degree. Furthermore, your field of study also plays some role in determining salary.




Tuesday, September 24, 2019

Brief introduction to Data Science

INTRODUCTION
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.
It unifies the concepts of statistics, data analysis, machine learning and their empirical methods.
HISTORY OF DATA SCIENCE
In an early usage, the term was used as a substitute for computer science by Peter Naur in 1960. He later introduced the term “datalogy”. In 1974, he published Concise Survey of Computer Methods, which freely used the term data science in its survey of contemporary data processing methods. The modern concept of “Data Science” was introduced during the second Japanese-French statistics symposium organised at the University of Montpellier (France) in 1992.
In November 1997, C. F. Jeff Wu equated data science with statistics in a lecture, which he later presented in honour of Prasanta Chandra Mahalanobis, the founder of the Indian Statistical Institute. From 2001 to 2009, researchers developed the concept further, including through the Committee on Data for Science and Technology (CODATA), with work on digital data collections, information and computer science, database system software etc.
In 2012, Harvard Business Review described data scientist as the most attractive job of the 21st century.
The IEEE later launched a Task Force on Data Science and Advanced Analytics, and the journal Statistical Analysis and Data Mining was renamed to include data science in its title.
UTILIZATION
At its core, data is just information – names, dates, times, dollar amounts, etc. Data scientists work with large collections of this information to draw conclusions. For example, they might use financial data to predict the seasonality of revenue generation, or use application events (like logins, clicks, or downloads) to detect security threats or fraud. Data science typically deals with ‘Big Data,’ which is too large and complex to manage on a local computer. People interact with and create data like this every day by using smartphones, buying houses, rating movies, and more. You can thank data scientists (and the teams supporting them) for guiding you to your favorite Netflix series and helping optimize your workouts.
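To make the idea of using application events to detect threats concrete, here is a minimal sketch in Python. The event log, the user names and the failure threshold are all hypothetical; real fraud detection would use far richer signals than a simple count.

```python
from collections import Counter

# Hypothetical event log: (user, event) pairs such as an application might record.
events = [
    ("alice", "login_ok"), ("bob", "login_failed"), ("bob", "login_failed"),
    ("carol", "login_ok"), ("bob", "login_failed"), ("alice", "download"),
    ("bob", "login_failed"), ("carol", "login_failed"), ("bob", "login_failed"),
]

def flag_suspicious_users(events, max_failures=3):
    """Return users whose failed-login count exceeds max_failures, sorted by name."""
    failures = Counter(user for user, event in events if event == "login_failed")
    return sorted(user for user, count in failures.items() if count > max_failures)

print(flag_suspicious_users(events))
```

The threshold of three failures is an arbitrary choice for illustration; in practice it would be tuned against historical data.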
Let’s Understand Why We Need Data Science.
  • Traditionally, the data that we had was mostly structured and small in size, and could be analyzed using simple BI tools. Today, by contrast, most of the data is unstructured or semi-structured. Let’s have a look at the data trends in the image given below, which shows that by 2020, more than 80% of the data will be unstructured.

  • This data is generated from different sources like financial logs, text files, multimedia forms, sensors, and instruments. Simple BI tools are not capable of processing this huge volume and variety of data. This is why we need more complex and advanced analytical tools and algorithms for processing, analysing and drawing meaningful insights out of it.

Let’s have a look at the below info-graphic to see all the domains where Data Science is creating its impression.

This is not the only reason why Data Science has become so popular. Let’s dig deeper and see how Data Science is being used in various domains.
  • How about if you could understand the precise requirements of your customers from existing data like the customer’s past browsing history, purchase history, age and income? No doubt you had all this data earlier too, but now, with the vast amount and variety of data, you can train models more effectively and recommend products to your customers with more precision. Wouldn’t it be amazing, as it will bring more business to your organisation?
  • Let’s take a different scenario to understand the role of Data Science in decision making. How about if your car had the intelligence to drive you home? Self-driving cars collect live data from sensors, including radars, cameras and lasers, to create a map of their surroundings. Based on this data, they take decisions like when to speed up, when to slow down, when to overtake and where to take a turn, making use of advanced machine learning algorithms.
  • Let’s see how Data Science can be used in predictive analytics. Let’s take weather forecasting as an example. Data from ships, aircraft, radars and satellites can be collected and analyzed to build models. These models will not only forecast the weather but also help in predicting the occurrence of natural calamities. This will help you to take appropriate measures beforehand and save many precious lives.
Let’s have a look at various model planning tools.
  1. R has a complete set of modelling capabilities and provides a good environment for building interpretive models.
  2. SQL Analysis Services can perform in-database analytics using common data mining functions and basic predictive models.
  3. SAS/ACCESS can be used to access data from Hadoop and is used for creating repeatable and reusable model flow diagrams.
Although many tools are available in the market, R is the most commonly used.
Now that you have got insights into the nature of your data and have decided on the algorithms to be used, the next stage is to apply the algorithm and build a model.
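As a rough illustration of the model building stage, here is a minimal sketch in Python (not one of the tools listed above) that fits a straight line by ordinary least squares and uses it to make a prediction. The function names and the spend and revenue figures are hypothetical and serve only as an example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: advertising spend versus revenue.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.1, 4.9, 7.2, 8.8, 11.0]

model = fit_line(spend, revenue)
print(predict(model, 6.0))
```

The same two steps, fitting on known data and then predicting on new inputs, carry over to the more advanced algorithms a data scientist would normally choose.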
CONCLUSION
In the end, it won’t be wrong to say that the future belongs to the Data Scientists. It was predicted that by the end of the year 2018, there would be a need for around one million Data Scientists. More and more data will provide opportunities to drive key business decisions. It is soon going to change the way we look at the world deluged with data around us. Therefore, a Data Scientist should be highly skilled and motivated to solve the most complex problems.


by Tadasha Ghose
BSc in Data Science (1st Year)
Source : web
