INTRODUCTION
Data science is a multi-disciplinary field that uses scientific
methods, processes, algorithms and systems to extract knowledge and
insights from structured and unstructured data. It unifies the
concepts of statistics, data analysis, machine learning and their
empirical methods.
HISTORY OF DATA SCIENCE
In an early usage, the term "data science" was used as a substitute
for computer science by Peter Naur in 1960. He later introduced the
term "datalogy". In 1974, he published Concise Survey of Computer
Methods, which freely used the term data science in its survey of
the contemporary data processing methods. The modern concept of
"data science" was introduced during the second Japanese-French
statistics symposium, organised at the University of Montpellier
(France) in 1992.
In November 1997, C. F. Jeff Wu equated data science with statistics
in his lecture "Statistics = Data Science?", which he later presented
in a lecture series honouring Prasanta Chandra Mahalanobis, the
founder of the Indian Statistical Institute. From 2001 to 2009,
researchers developed the concept further, including through the
Committee on Data for Science and Technology (CODATA), covering
digital data collections, information and computer science, database
system software and related topics.
In 2012, Harvard Business Review described the data scientist's role
as the most attractive job of the 21st century.
The IEEE later launched a Task Force on Data Science and Advanced
Analytics, and the journal Statistical Analysis and Data Mining was
renamed to include "The ASA Data Science Journal" in its title.
UTILIZATION
At its core, data is just information: names, dates, times, dollar
amounts, and so on. Data scientists work with large collections of
this information to draw conclusions. For example, they might use
financial data to predict the seasonality of revenue, or use
application events (like logins, clicks, or downloads) to detect
security threats or fraud. Data science typically deals with "Big
Data", which is too large and complex to manage on a local computer.
People create and interact with data like this every day by using
smartphones, buying houses, rating movies, and more. You can thank
data scientists (and the teams supporting them) for guiding you to
your favorite Netflix series and helping optimize your workouts.
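As a rough illustration of the revenue example above, the sketch below estimates a seasonal index for each calendar month from a few made-up revenue figures. The function name `seasonal_index` and the sample data are assumptions for illustration only; a real forecast would need far more history and a proper time-series model.

```python
from collections import defaultdict
from statistics import mean


def seasonal_index(records):
    """Return {month: average revenue for that month / overall average}.

    `records` is a list of (year, month, revenue) tuples.
    An index above 1.0 suggests a stronger-than-average month.
    """
    by_month = defaultdict(list)
    for _year, month, revenue in records:
        by_month[month].append(revenue)

    overall = mean(revenue for _year, _month, revenue in records)
    return {month: mean(values) / overall
            for month, values in sorted(by_month.items())}


# Hypothetical sample data: revenue is higher every December.
sample = [
    (2021, 6, 100.0), (2021, 12, 200.0),
    (2022, 6, 100.0), (2022, 12, 200.0),
]
index = seasonal_index(sample)
print(index)
```

Here December gets an index above 1.0 and June an index below 1.0, which is the kind of pattern a data scientist would then try to confirm on a longer history.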
- Traditionally, the data that we had was mostly structured and small in size, so it could be analyzed using simple BI tools. Today, by contrast, most data is unstructured or semi-structured. Projections of data trends suggested that by 2020 more than 80% of data would be unstructured.
- This data is generated from different sources like financial logs, text files, multimedia forms, sensors, and instruments. Simple BI tools are not capable of processing this huge volume and variety of data. This is why we need more complex and advanced analytical tools and algorithms for processing, analysing and drawing meaningful insights out of it.
Data Science is making an impact across many domains. A few
scenarios illustrate this.
What if you could understand the precise requirements of your
customers from existing data such as their past browsing history,
purchase history, age and income? No doubt you had all this data
earlier too, but now, with the vast amount and variety of data, you
can train models more effectively and recommend products to your
customers with more precision. This, in turn, can bring more business
to your organisation.
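A minimal sketch of the recommendation idea above, assuming the only data available is each customer's purchase history. It counts which products are bought together and suggests the most frequent companions. The helper names `build_co_purchase_counts` and `recommend` and the sample baskets are hypothetical; production recommenders use much richer signals and models.

```python
from collections import Counter
from itertools import combinations


def build_co_purchase_counts(baskets):
    """Count how often each ordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # De-duplicate so a product bought twice is counted once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(product, counts, top_n=2):
    """Return up to `top_n` products most often bought with `product`."""
    related = [(other, n) for (item, other), n in counts.items()
               if item == product]
    # Sort by count (descending), then by name so ties are deterministic.
    related.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _n in related[:top_n]]


# Hypothetical purchase histories, one list per customer.
baskets = [
    ["laptop", "mouse", "bag"],
    ["laptop", "mouse"],
    ["laptop", "bag"],
    ["phone", "case"],
]
counts = build_co_purchase_counts(baskets)
print(recommend("laptop", counts))
```

Attributes such as age and income are left out here to keep the sketch short; they would normally enter as additional features in a trained model.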
- Let’s take a different scenario to understand the role of Data Science in decision making. What if your car had the intelligence to drive you home? Self-driving cars collect live data from sensors, including radars, cameras and lasers, to create a map of their surroundings. Based on this data, and using advanced machine learning algorithms, the car decides when to speed up, when to slow down, when to overtake and where to turn.
- Let’s see how Data Science can be used in predictive analytics, taking weather forecasting as an example. Data from ships, aircraft, radars and satellites can be collected and analyzed to build models. These models not only forecast the weather but also help predict natural calamities, so that appropriate measures can be taken beforehand and many precious lives saved.
Let’s have a look at various model planning tools.
- R has a complete set of modelling capabilities and provides a good environment for building interpretive models.
- SQL Analysis Services can perform in-database analytics using common data mining functions and basic predictive models.
- SAS/ACCESS can be used to access data from Hadoop and is used for creating repeatable and reusable model flow diagrams.
Although many tools are available on the market, R is the most
commonly used.
By this point you have gained insights into the nature of your data
and have decided which algorithms to use. In the next stage, you
will apply the algorithm and build a model.
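To make "apply the algorithm and build a model" concrete, here is a minimal sketch that fits a straight line by ordinary least squares and uses it to predict a new value. It is written in Python rather than R purely for illustration, and the function names `fit_line` and `predict` and the training data are assumptions, not part of any particular tool.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted model to a new input."""
    return slope * x + intercept


# Hypothetical training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, predict(5.0, slope, intercept))
```

Real data is noisy, so the fitted line would only approximate the observations, and the model would be checked on data held back from training.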
CONCLUSION
In the end, it would not be wrong to say that the future belongs to
Data Scientists. It is predicted that by the end of 2018 there will
be a need for around one million Data Scientists. More and more data
will provide opportunities to drive key business decisions, and it
will soon change the way we look at a world deluged with data.
A Data Scientist should therefore be highly skilled and motivated to
solve the most complex problems.
by Tadasha Ghose
BSc in Data Science (1st Year)
Source: web