Tuesday, September 24, 2019

Brief introduction to Data Science

INTRODUCTION
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.
It unifies the concepts of statistics, data analysis, machine learning and their related empirical methods.
HISTORY OF DATA SCIENCE
In an early usage, the term was used as a substitute for computer science by Peter Naur in 1960. He later introduced the term "datalogy". In 1974, he published Concise Survey of Computer Methods, which freely used the term data science in its survey of contemporary data processing methods. The modern concept of “Data Science” was introduced during the second Japanese-French statistics symposium, organised at the University of Montpellier (France) in 1992.
In November 1997, C. F. Jeff Wu proposed treating data science as equivalent to statistics; he later presented this view in a memorial lecture honouring Prasanta Chandra Mahalanobis, founder of the Indian Statistical Institute. From 2001 to 2009, researchers developed the concept further, including work associated with the Committee on Data for Science and Technology (CODATA), covering digital data collections, information and computer science, database system software, etc.
In 2012, Harvard Business Review described data scientist as the most attractive job of the 21st century.
The IEEE launched a Task Force on Data Science and Advanced Analytics, and the journal Statistical Analysis and Data Mining was renamed to include data science in its title.
UTILIZATION
At its core, data is just information: names, dates, times, dollar amounts, etc. Data scientists work with large collections of this information to draw conclusions. For example, they might use financial data to predict the seasonality of revenue or use application events (like logins, clicks, or downloads) to detect security threats or fraud. Data science typically deals with ‘Big Data,’ which is too large and complex to manage on a single local computer. People interact with and create data like this every day by using smartphones, buying houses, rating movies, and more. You can thank data scientists (and the teams supporting them) for guiding you to your favorite Netflix series and helping optimize your workouts.
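As a rough illustration of using application events to spot possible threats, the sketch below flags days whose login count sits far from the average. It is only a minimal example in Python: the login counts are made up, the function name `flag_unusual_counts` is hypothetical, and the simple standard-deviation rule stands in for the far richer methods a real fraud team would use.

```python
from statistics import mean, stdev


def flag_unusual_counts(daily_logins, threshold=2.0):
    """Return the indices of days whose login count is more than
    `threshold` sample standard deviations away from the mean."""
    mu = mean(daily_logins)
    sigma = stdev(daily_logins)
    if sigma == 0:
        # Every day looks the same, so nothing stands out.
        return []
    return [
        day
        for day, count in enumerate(daily_logins)
        if abs(count - mu) / sigma > threshold
    ]


# Hypothetical login counts for ten days; day 7 contains a suspicious spike.
logins = [102, 98, 105, 99, 101, 97, 103, 480, 100, 104]
unusual_days = flag_unusual_counts(logins)
print(unusual_days)  # -> [7]
```

A flagged day is only a hint that something deserves a closer look, not proof of fraud.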
Let’s Understand Why We Need Data Science.
  • Traditionally, the data that we had was mostly structured and small in size, so it could be analyzed using simple BI tools. Unlike the data in traditional systems, most of today's data is unstructured or semi-structured. Data trends projected that by 2020, more than 80% of data would be unstructured.

  • This data is generated from different sources like financial logs, text files, multimedia forms, sensors, and instruments. Simple BI tools are not capable of processing this huge volume and variety of data. This is why we need more complex and advanced analytical tools and algorithms for processing, analysing and drawing meaningful insights out of it.

Let’s have a look at the infographic below to see all the domains where Data Science is making an impact.

This is not the only reason why Data Science has become so popular. Let’s dig deeper and see how Data Science is being used in various domains.
  • What if you could understand the precise requirements of your customers from existing data such as their past browsing history, purchase history, age and income? No doubt you had all this data earlier too, but now, with the vast amount and variety of data, you can train models more effectively and recommend products to your customers with more precision. Wouldn’t that be amazing, as it would bring more business to your organisation?
  • Let’s take a different scenario to understand the role of Data Science in decision making. What if your car had the intelligence to drive you home? A self-driving car collects live data from sensors, including radars, cameras and lasers, to create a map of its surroundings. Based on this data, it takes decisions like when to speed up, when to slow down, when to overtake, and where to take a turn, making use of advanced machine learning algorithms.
  • Let’s see how Data Science can be used in predictive analytics. Let’s take weather forecasting as an example. Data from ships, aircraft, radars and satellites can be collected and analyzed to build models. These models will not only forecast the weather but also help in predicting the occurrence of natural calamities. This will help you to take appropriate measures beforehand and save many precious lives.
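The customer-recommendation scenario above can be sketched in a few lines of Python. This is only an illustrative example, not a production recommender: the customer names, the purchase data and the `recommend` function are all hypothetical, and the scoring rule (count how much purchase history two customers share) is a deliberately simple stand-in for a trained model.

```python
from collections import Counter


def recommend(purchases, customer, top_n=2):
    """Suggest products the customer has not bought yet, scored by how
    much purchase history the customer shares with other buyers."""
    own_items = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(own_items & items)
        if overlap == 0:
            # No shared history, so this customer tells us nothing.
            continue
        for item in items - own_items:
            scores[item] += overlap
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history, one set of products per customer.
purchase_history = {
    "asha": {"laptop", "mouse"},
    "ben": {"laptop", "mouse", "keyboard"},
    "chen": {"laptop", "monitor"},
    "dev": {"novel"},
}
suggestions = recommend(purchase_history, "asha")
print(suggestions)  # -> ['keyboard', 'monitor']
```

With more customers and more kinds of data (browsing history, age, income), the same idea scales up into the models described above.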
Let’s have a look at various model planning tools.
  1. R has a complete set of modelling capabilities and provides a good environment for building interpretive models.
  2. SQL Analysis Services can perform in-database analytics using common data mining functions and basic predictive models.
  3. SAS/ACCESS can be used to access data from Hadoop and is used for creating repeatable and reusable model flow diagrams.
Although many tools are available in the market, R is the most commonly used.
Now that you have insights into the nature of your data and have decided which algorithms to use, the next stage is to apply the algorithm and build a model.
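To make the model-building stage concrete, here is a minimal sketch of fitting and using a very simple model: a straight line fitted by least squares. It is written in plain Python rather than R purely for illustration, the revenue figures are invented, and the names `fit_line` and `predict` are hypothetical helpers, not part of any of the tools listed above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical monthly revenue figures (month number, revenue).
months = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

model = fit_line(months, revenue)
forecast = predict(model, 6)
print(model)     # -> (2.0, 10.0)
print(forecast)  # -> 22.0
```

Real projects use richer algorithms and far more data, but the flow is the same: choose an algorithm, fit it to existing data, then use the fitted model on new inputs.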
CONCLUSION
In the end, it won’t be wrong to say that the future belongs to data scientists. It was predicted that by the end of 2018 there would be a need for around one million data scientists. More and more data will provide opportunities to drive key business decisions, and it will soon change the way we look at a world deluged with data. Therefore, a data scientist should be highly skilled and motivated to solve the most complex problems.


by Tadasha Ghose
BSc in Data Science (1st Year)
Source : web
