
Data Scientist Program

 

Free Online Data Science Training for Complete Beginners.
 


No prior coding knowledge required!

Time Series


Everything that happens is related to time. Any action or change inevitably takes place over an interval of time. It is therefore essential to study events with respect to time in order to identify the patterns in which they occur, and to learn from the past to improve the future. The purpose of this research is to apply our knowledge of a mathematical concept called Time Series, which is concerned solely with data gathered over time. To make use of Time Series data it has to be analyzed, which is where Time Series Analysis becomes important. The patterns extracted by the Analysis are then used in forecasting, which is one of the many characteristics of Time Series models.

A Time Series is composed of four variations: trend, seasonal, cyclic, and irregular. Each contributes to the understanding of the data. The history of Time Series shows how much it was needed: the prediction of weather was of great interest to thinkers such as Aristotle, who developed ideas about the causes and sequences of the weather. Since then, Time Series and its Analysis have expanded from the weather to applications such as sales, finance, heart rate measurement, stock prices, and much more. It has also entered countless fields; in computer science, to name a few, these include machine learning, cybersecurity, the internet of things (IoT), and data mining. In this research, more in-depth details are given about each of these applications and fields regarding the use of Time Series. As useful as Time Series is, there are cases in which its use is not optimal, or even useless. This is mostly determined by the stationarity and correlation properties of the data, which may need to be converted to a suitable form first. Graphs play a vital role in visualizing the output of such Analysis, and they take various shapes and forms according to their purpose. Lastly, all of this data Analysis is based on theory: mathematical equations that describe the relationships that tie everything together.
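
The four variations can be made concrete with a small additive decomposition. The sketch below is an illustration, not the method used in this research. It assumes an additive model (observation = trend + seasonal + irregular), a known seasonal period of 7, and a synthetic series. The function names are hypothetical, and any cyclic variation is simply left inside the trend estimate.

```python
def moving_average_trend(series, period):
    """Centered moving average over one full period (period assumed odd)."""
    half = period // 2
    trend = [None] * len(series)  # the edges have no full window
    for i in range(half, len(series) - half):
        window = series[i - half : i + half + 1]
        trend[i] = sum(window) / period
    return trend


def seasonal_indices(series, trend, period):
    """Average detrended value at each position in the cycle, centered on zero."""
    buckets = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(y - t)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    return [m - offset for m in means]


# Synthetic data (an assumption for illustration): linear trend plus a
# repeating 7-step pattern that sums to zero, with no noise added.
PERIOD = 7
pattern = [3.0, 1.0, -2.0, -4.0, -1.0, 1.0, 2.0]
series = [10.0 + 0.5 * t + pattern[t % PERIOD] for t in range(35)]

trend = moving_average_trend(series, PERIOD)
seasonal = seasonal_indices(series, trend, PERIOD)
# Whatever is left after removing trend and seasonal is the irregular part.
irregular = [
    y - t - seasonal[i % PERIOD]
    for i, (y, t) in enumerate(zip(series, trend))
    if t is not None
]
```

Because the synthetic series contains no noise, the recovered seasonal indices match `pattern` and the irregular component is essentially zero. With real data the irregular part would carry the unexplained variation.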


Introduction

A Time Series is a sequence of data observations recorded over a period of time, which makes it important to many fields such as machine learning, networking, security, and data mining. By understanding and analyzing historical data over time, we can predict future data. On the other hand, data collected irregularly or only once are not considered a Time Series. In a statistical Analysis of Time Series, the data are collected from a real-life process we are interested in and then conditioned so that they can be used to predict future values.

The simplest example of a Time Series, one that all of us come across on a day-to-day basis, is the change in temperature over a day, week, month, or year.

Definition of Time Series:

A Time Series is a set of numerical measurements of the same entity taken at equally spaced intervals over time. Time Series data could be collected yearly, monthly, or daily.
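
The "equally spaced" requirement in this definition can be checked mechanically. The following is a minimal sketch using only the Python standard library; the helper name `is_equally_spaced` and the sample dates are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def is_equally_spaced(timestamps):
    """Return True when every consecutive pair of timestamps is the same distance apart."""
    if len(timestamps) < 3:
        return True  # zero or one gap cannot disagree with itself
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) == 1


start = datetime(2024, 1, 1)
daily = [start + timedelta(days=i) for i in range(5)]                  # one reading per day
gappy = [start, start + timedelta(days=1), start + timedelta(days=4)]  # uneven gaps

print(is_equally_spaced(daily))  # True
print(is_equally_spaced(gappy))  # False
```

Note that this check compares exact durations. Monthly or yearly data would need a calendar-aware comparison, since months and years do not all contain the same number of days.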

The Reason for choosing Time Series:

Time Series Analysis is special because it helps organizations understand the underlying causes of trends or systemic patterns over time. Using data visualizations, business users can see seasonal trends and dig deeper into why these trends occur. Companies can also use Time Series Analysis to predict the likelihood of future events, such as upcoming trends in fashion and popular music albums.

The goal of using Time Series Data:

Our aim is to use previously collected data to predict what will occur in the future. This applies in many fields, such as weather forecasting and many machine learning applications.


Time Series Analysis:

Time Series Analysis is a specific way of analyzing a sequence of data points collected over an interval

of time to identify the common patterns displayed by the data.

In Time Series Analysis, analysts record data points at consistent intervals over a set period rather than

just recording the data points intermittently or randomly.

However, this type of Analysis is not merely the act of collecting data over time.

Time Series Analysis typically requires a large number of data points to ensure consistency and reliability. An extensive data set gives you a representative sample size and lets the Analysis cut through noisy data. It also helps ensure that any trends or patterns discovered are not outliers and that seasonal variance can be accounted for. Additionally, Time Series data can be used for forecasting, that is, predicting future data based on historical data. Because data points are collected in adjacent periods, there is always the potential for correlation between successive observations.
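
The correlation between observations in adjacent periods can be measured with the lag-1 autocorrelation. The sketch below uses the usual sample autocorrelation formula. The function name and both example series are assumptions chosen for illustration, and the function expects a series that is not constant.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (zero for a constant series, which is not supported).
    denom = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


rising = [float(t) for t in range(20)]                           # each value close to its neighbor
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]   # each value opposite to its neighbor

print(round(autocorrelation(rising), 2))       # 0.85
print(round(autocorrelation(alternating), 2))  # -0.95
```

A value near +1 means neighboring observations tend to move together, and a value near -1 means they tend to alternate. Values near 0 suggest that adjacent periods carry little information about each other.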


Types of Time Series Data:

- Generally, Time Series data is classified into two types:

1. A Stock Series is a measure of certain attributes at a point in time and can be thought of as a “stocktake”. For example, the monthly labor force survey is a stock measure because it takes stock of whether a person was employed in the reference week.

2. A Flow Series is a measure of activity over a given period. For example, surveys of retail trade activity are flow measures. Manufacturing is also a flow measure because a certain amount is produced each day, and these amounts are then summed to give a total value for production for a given reporting period.

The main difference between a stock and a flow series is that a flow series can contain effects related to

the calendar (Trading Day Effects). Both types of series can still be seasonally adjusted using the same

seasonal adjustment process.
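
The difference in how the two kinds of series roll up to a reporting period can be sketched as follows. The helper `to_monthly` and the daily figures are hypothetical. Taking the last observation of the month as the stock value is an assumed convention, and other reference points are possible.

```python
from datetime import date, timedelta


def to_monthly(daily, kind):
    """Aggregate (date, value) pairs, sorted by date, into one value per month.

    kind="flow"  sums the daily amounts within each month.
    kind="stock" keeps the last level observed in each month.
    """
    monthly = {}
    for day, value in daily:
        key = (day.year, day.month)
        if kind == "flow":
            monthly[key] = monthly.get(key, 0.0) + value
        elif kind == "stock":
            monthly[key] = value  # later days overwrite earlier ones
        else:
            raise ValueError("kind must be 'flow' or 'stock'")
    return monthly


start = date(2023, 1, 1)
days = [start + timedelta(days=i) for i in range(59)]  # 1 Jan to 28 Feb 2023

production = [(d, 10.0) for d in days]                       # flow: 10 units made every day
inventory = [(d, 100.0 + i) for i, d in enumerate(days)]     # stock: level on each day

monthly_production = to_monthly(production, "flow")
monthly_inventory = to_monthly(inventory, "stock")
print(monthly_production)  # {(2023, 1): 310.0, (2023, 2): 280.0}
print(monthly_inventory)   # {(2023, 1): 130.0, (2023, 2): 158.0}
```

Daily production is constant here, yet the February total is lower purely because the month has fewer days. This is the kind of calendar-related effect that a flow series can contain and a stock series does not.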

- In addition to the above classification, Time Series data could also be classified into three types:

1. Univariate:

A univariate Time Series consists of sequential measurements of a single variable over time. Consider a Time Series dataset that contains measurements of a person named Mike, who has certain features (variables), such as gender, height, weight, and pulse. If we collect measurements of one of these variables, say Mike’s weight, over time, we have a univariate Time Series. Using these values of Mike’s weight, we can build a model to predict his future weight.

2. Multivariate (Bivariate):

A multivariate Time Series consists of multiple related variables measured over time. For example, take Mike’s height and weight, where we know that there is a relationship between the two variables. In that case, we have a bivariate Time Series. Using these values of Mike’s weight and height, we can build a prediction model to determine his future weight or height.

3. Multiple (Pooled Data):

A multiple Time Series contains measurements of multiple entities that are independent. Let’s build upon the univariate example by including weight measurements for Mike’s neighbor and for Kate. Suppose we know that the measurements of these individuals are independent of each other. In that case, we can say that the dataset contains multiple univariate Time Series, and predicting the weight of an individual would depend on his or her previous weight alone.
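
One way to picture the three types is by the shape of the data each one needs. The sketch below uses plain Python containers. All of the numbers are made up for illustration, and `naive_forecast` (repeat the last value) is a deliberately simple stand-in for a real prediction model.

```python
# Univariate: one variable for one entity, ordered in time.
mike_weight = [80.0, 80.5, 81.0, 80.8]

# Multivariate (bivariate): related variables observed together at each time step,
# here as (height_cm, weight_kg) pairs.
mike_height_weight = [(178.0, 80.0), (178.0, 80.5), (178.1, 81.0), (178.1, 80.8)]

# Multiple (pooled): several independent univariate series, one per entity.
pooled_weights = {
    "mike": mike_weight,
    "neighbor": [92.0, 91.5, 91.7, 91.2],
    "kate": [62.0, 62.3, 61.9, 62.1],
}


def naive_forecast(series):
    """Predict the next value as the most recent observation."""
    return series[-1]


# Because the pooled series are independent, each forecast uses only
# that person's own history.
forecasts = {name: naive_forecast(history) for name, history in pooled_weights.items()}
print(forecasts)  # {'mike': 80.8, 'neighbor': 91.2, 'kate': 62.1}
```

A bivariate model would instead take both columns of `mike_height_weight` as input, since the variables are assumed to be related.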

Time Series in Computer Science Fields:

1. Machine learning (ML):

Predictive models based on machine learning are widely used in the time series projects that businesses run to plan the allocation of time and resources.

Its methods:

1.1 Recurrent neural network (RNN): RNNs are neural networks with memory that can be used for predicting time-dependent targets. A recurrent neural network memorizes the previously captured state of the input and uses it to decide the output for the next time step. Recently, many variations have been introduced to adapt recurrent networks to a variety of domains.

1.2 Long short-term memory (LSTM): LSTM cells are special RNN cells that were developed to address the vanishing gradient problem. They introduce several gates that help the model decide what information to mark as significant and what information to ignore. GRU is another type of gated recurrent network.

2. Cyber security and network: Computer attacks interrupt day-to-day services and cause data loss and network interruption. Time series analysis methods are popular for quantitatively detecting anomalies or outliers in data, by either data fitting or forecasting. Time series analysis helps thwart compromises and keep information loss to a minimum.

The following graph shows the attacks mitigated on a routed platform.

3. Internet of things (IoT): IoT-based prediction can play a key role in enabling companies to plan and operate more efficiently.

IoT-based temperature prediction is already being used successfully by manufacturers, who collect weather data via a sensor board and then store and analyze that data to make next-day predictions.
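
The anomaly detection described in the cyber security item, and the sensor monitoring in the IoT item, both come down to comparing each new reading with what recent history suggests. The sketch below shows one simple way to do that, a rolling z-score. This technique is chosen here for illustration and is not named in the text above. The function name, window size, threshold, and traffic figures are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings far from the mean of the preceding `window` readings.

    A reading is flagged when it lies more than `threshold` sample standard
    deviations away from the mean of the window just before it.
    """
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window : i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: the z-score is undefined, so skip this point
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical requests-per-minute counts with one burst at index 7.
requests_per_minute = [100, 102, 98, 101, 99, 103, 100, 400, 101, 99]
print(flag_anomalies(requests_per_minute))  # [7]
```

One design caveat: once the burst enters the history window it inflates the standard deviation, so readings immediately after it are judged leniently. A production detector would usually exclude flagged points from the history or use a more robust spread estimate.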


Modeling

1. Model Characteristics:

Each model can exhibit any of these characteristics.

1.1 Classification: identifies and assigns categories to the data.

1.2 Curve fitting: plots the data along a curve to study the relationships of variables within the data.

1.3 Descriptive analysis: identifies patterns in time series data, like trends, cycles, or seasonal variations.

1.4 Explanative analysis: attempts to understand the data and the relationships within it, as well as cause and effect.

1.5 Exploratory analysis: highlights the main characteristics of the time series data, usually in a visual format.

1.6 Forecasting: predicts future data based on historical trends. It uses historical data as a model for future data, predicting scenarios that could happen along future plot points.

Time series forecasting is the process of analyzing time series data using statistics and modeling to make predictions and inform strategic decision-making. The forecast is not always an exact prediction.

Organizations analyze data over consistent intervals. They can also use time series forecasting to predict the likelihood of future events. Time series forecasting is part of predictive analysis. It can show likely changes in the data, like seasonality or cyclic behavior, which provides a better understanding of the data variables and helps produce better forecasts.

1.7 Intervention analysis: studies how an event can change the data.

1.8 Segmentation: splits the data into segments to show the underlying properties of the source information.
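
Of the characteristics listed above, forecasting is the easiest to show in a few lines. The sketch below uses simple exponential smoothing, one basic forecasting technique picked for illustration rather than one prescribed by the text. The function name, the smoothing factor `alpha`, and the sales figures are assumptions.

```python
def exponential_smoothing_forecast(series, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    The running level is a weighted average of the newest observation and the
    previous level, so recent data counts more than older data.
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


monthly_sales = [100.0, 110.0, 120.0, 130.0]  # hypothetical, steadily rising
print(exponential_smoothing_forecast(monthly_sales))  # 121.25
```

The forecast of 121.25 is below the latest observation of 130 even though sales are rising, which illustrates the point that a forecast is not always an exact prediction. Simple exponential smoothing has no trend or seasonal term, so it lags behind data that has either.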

Note that you should clean your data before modeling it, which also makes it possible to identify outliers in the data. Perhaps instead of looking at actual sales, it would make more sense to plot the percentage difference between observations. This is a technique that can help smooth out very noisy data. In this case, plotting the percentage difference in sales from month to month smoothed out much of the data, except for the enormous spike in March 2017. This is not necessarily a bad thing. However, without performing these analytic steps we might have been unaware that such a spike existed.
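
The percentage-difference step can be sketched in a few lines. The sales figures below are invented for illustration and are not the data behind the March 2017 spike mentioned above. The function name is also an assumption.

```python
def percent_change(series):
    """Percentage difference between each observation and the one before it.

    The result is one element shorter than the input. A previous value of
    zero would cause a division by zero, so the input is assumed non-zero.
    """
    return [(cur - prev) / prev * 100.0 for prev, cur in zip(series, series[1:])]


monthly_sales = [200.0, 210.0, 205.0, 820.0, 215.0]  # hypothetical, one unusual month
changes = percent_change(monthly_sales)

# The largest absolute change points at the observation worth investigating.
spike = max(range(len(changes)), key=lambda i: abs(changes[i]))
print([round(c, 1) for c in changes])  # [5.0, -2.4, 300.0, -73.8]
print(spike)                           # 2, i.e. the jump into the fourth month
```

Ordinary month-to-month movements show up as small percentages, so the single large value stands out clearly and can be checked against the source data before deciding whether it is an error or a real event.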


