Monday, September 3, 2018

Seven Types Of Regression Techniques

Introduction

Linear and logistic regression are usually the first algorithms people learn for predictive modeling. Because of their popularity, many analysts assume they are the only forms of regression, and those who know a little more believe they are the most important ones.
The fact is that there are numerous forms of regression, each with its own importance and a specific situation where it is best suited. In this article, I explain the most commonly used forms of regression in a simple way. Through this article, I hope people develop an idea of the breadth of regression techniques, instead of applying linear or logistic regression to every problem they encounter and hoping the result is right!

What is Regression Analysis?

Regression analysis is a technique for modeling the relationship between a dependent (target) variable and one or more independent variables (predictors). This technique is used for forecasting, time-series modeling, and finding cause-and-effect relationships between variables. For example, the relationship between reckless driving and the number of road accidents caused by a driver is best studied through regression.
Regression analysis is an important tool for modeling and analyzing data. Here, we fit a curve or line to the data points in such a way that the differences between the distances of the data points from the curve or line are minimized. We will explain this in detail in the following sections.

Types of Regression

1. Linear Regression

It is one of the most widely known techniques. Linear regression is usually among the first topics people pick while learning predictive modeling. In this technique, the dependent variable is continuous, the independent variable(s) can be continuous or discrete, and the nature of the regression line is linear.

Linear regression establishes a relationship between the dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).

It is represented by the equation Y = a + b * X + e, where a is the intercept, b is the slope of the line, and e is the error term. This equation can be used to predict the value of the target variable based on the given predictor variable(s).

The difference between simple linear regression and multiple linear regression is that multiple linear regression has more than one (> 1) independent variable, while simple linear regression has only one independent variable. Now, the question is "How do we obtain the best fit line?" It can be found by the least squares method, which minimizes the sum of the squared vertical distances between the observed points and the fitted line.
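As a minimal sketch (plain Python, no libraries; the data values below are made up for illustration), the least squares method fits the line Y = a + b * X using the closed-form formulas for the slope and intercept:

```python
def fit_simple_linear(xs, ys):
    """Return (intercept a, slope b) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x  # the fitted line passes through the means
    return a, b

# made-up data, roughly y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]
a, b = fit_simple_linear(xs, ys)
```

The recovered slope is close to 2 and the intercept close to 0, matching the data-generating line.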

2. Logistic Regression

Logistic regression is used to find the probability of event = Success and event = Failure. We should use logistic regression when the dependent variable is binary (0/1, True/False, Yes/No) in nature.
Here, the value of Y ranges from 0 to 1 and it can be represented by the following equations.

    odds = p / (1 - p) = probability of event occurrence / probability of event not occurring

    ln(odds) = ln(p / (1 - p))

    logit(p) = ln(p / (1 - p)) = b0 + b1X1 + b2X2 + b3X3 .... + bkXk

Above, p is the probability of presence of the characteristic of interest. A question you should ask here is "Why have we used a log in the equation?"

Since we are working here with a binomial distribution (dependent variable), we need to choose a link function which is best suited for this distribution, and that is the logit function. In the equation above, the parameters are chosen to maximize the likelihood of observing the sample values rather than to minimize the sum of squared errors (as in ordinary regression).
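The relationship between the odds, the logit link and the probability can be sketched in plain Python (the probability value used below is made up): the logistic (sigmoid) function is the inverse of the logit, so it maps the linear predictor b0 + b1X1 + ... back to a probability between 0 and 1.

```python
import math

def odds(p):
    """odds = p / (1 - p)."""
    return p / (1 - p)

def logit(p):
    """The link function: ln(p / (1 - p)), set equal to b0 + b1X1 + ..."""
    return math.log(odds(p))

def sigmoid(z):
    """Inverse of the logit: maps the linear predictor to a probability."""
    return 1 / (1 + math.exp(-z))

p = 0.8
z = logit(p)        # probability -> linear-predictor scale
p_back = sigmoid(z) # linear predictor -> probability (recovers p)
```

Because the sigmoid output always lies strictly between 0 and 1, the model can never predict an impossible probability, which is exactly why the log transform is used.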

3. Polynomial Regression

A regression equation is a polynomial regression equation if the power of the independent variable is greater than 1. The following equation represents a polynomial equation:
                                                          y = a + b * x ^ 2
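A quick plain-Python sketch (the data below are constructed from y = 1 + 2x², so the result can be checked): a polynomial model like the one above can still be fitted with ordinary least squares by first transforming the feature to z = x², which makes the problem linear in z.

```python
def fit_quadratic(xs, ys):
    """Fit y = a + b * x^2 by least squares on the transformed feature z = x^2."""
    zs = [x ** 2 for x in xs]
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    b = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / \
        sum((z - mean_z) ** 2 for z in zs)
    a = mean_y - b * mean_z
    return a, b

# data generated exactly from y = 1 + 2 * x^2
a, b = fit_quadratic([0, 1, 2, 3], [1, 3, 9, 19])
```

The fit recovers a = 1 and b = 2, the coefficients used to generate the data.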


4. Stepwise Regression

This form of regression is used when we deal with multiple independent variables. In this technique, the selection of independent variables is done with the help of an automatic process, which involves no human intervention.

This feat is achieved by observing statistical values like R-squared, t-stats and the AIC metric to identify significant variables. Stepwise regression basically fits the regression model by adding/dropping covariates one at a time based on a specified criterion.

Some of the most commonly used stepwise regression methods are listed below:
1. Standard stepwise regression does two things: it adds and removes predictors as needed at each step.
2. Forward selection starts with the most significant predictor in the model and adds a variable at each step.
3. Backward elimination starts with all predictors in the model and removes the least significant variable at each step.
The aim of this modeling technique is to maximize the prediction power with the minimum number of predictor variables. It is one of the methods for handling high-dimensional data sets.
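The forward selection variant above can be sketched in plain Python (the toy data and stopping tolerance are made up): an ordinary-least-squares helper solves the normal equations by Gaussian elimination, and the loop adds, at each step, the candidate predictor that most reduces the residual sum of squares, stopping when the improvement becomes negligible.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def sse(cols, y):
    """Residual sum of squares of OLS (with intercept) on the given columns."""
    X = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(A, b)
    return sum((yi - sum(bi * xi for bi, xi in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def forward_select(features, y, tol=1e-6):
    """Greedily add the predictor giving the largest drop in SSE."""
    chosen, remaining = [], dict(features)
    best = sum((yi - sum(y) / len(y)) ** 2 for yi in y)  # intercept-only SSE
    while remaining:
        name, score = min(((n, sse([c for _, c in chosen] + [col], y))
                           for n, col in remaining.items()),
                          key=lambda t: t[1])
        if best - score <= tol:   # no meaningful improvement: stop
            break
        chosen.append((name, remaining.pop(name)))
        best = score
    return [n for n, _ in chosen]

# x1 drives y exactly (y = 3 * x1); x2 is irrelevant and should be skipped
features = {"x1": [1, 2, 3, 4], "x2": [1, 0, 1, 0]}
selected = forward_select(features, [3, 6, 9, 12])
```

Here the procedure selects only x1, illustrating how the automatic process keeps the prediction power while dropping uninformative predictors.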

5. Ridge regression

Ridge regression is a technique used when the data suffer from multicollinearity (independent variables are highly correlated). In multicollinearity, even though the ordinary least squares (OLS) estimates are unbiased, their variances are large, which deviates the observed value far from the true value. By adding a degree of bias to the regression estimates, ridge regression reduces the standard errors.
Above, we saw the equation for the linear regression. Remember? It can be represented as:
                                                    y = a + b * x
This equation also has an error term. The complete equation becomes:

y = a + b * x + e (error term), [the error term is the value needed to correct a prediction error between the observed value and the predicted value]

=> y = a + b1x1 + b2x2 + .... + e, for multiple independent variables.
In a linear equation, prediction errors can be decomposed into two subcomponents: the first is due to the bias and the second is due to the variance. Prediction error can occur due to either or both of these components. Here, we will discuss the error caused by variance. Ridge regression tackles it by adding a shrinkage penalty, λ times the sum of squared coefficients, to the least-squares objective, trading a small amount of bias for a large reduction in variance.
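As a minimal sketch of how the shrinkage works (simplifying assumptions: a single predictor and mean-centered data, so the intercept and matrix algebra drop out; the data values are made up), ridge simply adds λ to the denominator of the least-squares slope, pulling the estimate toward zero as λ grows:

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate b = sum(x*y) / (sum(x^2) + lam) for centered x, y."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [-2, -1, 0, 1, 2]            # already mean-centered
ys = [-4.1, -1.9, 0.0, 2.1, 3.9]  # roughly y = 2x
b_ols = ridge_slope(xs, ys, 0.0)    # lambda = 0 recovers ordinary least squares
b_ridge = ridge_slope(xs, ys, 10.0) # positive lambda shrinks the slope
```

With λ = 0 the slope is the OLS value (about 2); with λ = 10 it is halved, which is the biased-but-lower-variance estimate the section describes.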

6. Lasso regression

Similar to ridge regression, Lasso (Least Absolute Shrinkage and Selection Operator) also penalizes the absolute size of the regression coefficients. In addition, it is capable of reducing the variability and improving the accuracy of linear regression models.

Lasso regression differs from ridge regression in that it uses absolute values in the penalty function, instead of squares. This leads to penalizing (or equivalently constraining the sum of the absolute values of the estimates) values, which causes some of the parameter estimates to turn out exactly zero. The larger the penalty applied, the further the estimates get shrunk towards absolute zero. This results in variable selection out of the given variables.
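The exact-zero behaviour can be sketched with the soft-thresholding operator that appears in lasso's coordinate-wise solution (same simplifying assumption as before: one mean-centered predictor; the data values are made up): when the penalty λ exceeds the least-squares signal, the coefficient is set exactly to zero, which is what performs the variable selection.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_slope(xs, ys, lam):
    """Lasso estimate for one centered predictor: S(sum(x*y), lam) / sum(x^2)."""
    rho = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(rho, lam) / sum(x * x for x in xs)

xs = [-2, -1, 0, 1, 2]
ys = [-2.0, -1.0, 0.0, 1.0, 2.0]  # y = x, so sum(x*y) = 10
weak = lasso_slope(xs, ys, 2.0)   # shrunk but non-zero
gone = lasso_slope(xs, ys, 15.0)  # penalty exceeds the signal: exactly zero
```

Unlike ridge, which only shrinks the slope, a large enough lasso penalty removes the predictor from the model entirely.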

7. Elastic Net Regression

Elastic Net is a hybrid of the Lasso and Ridge regression techniques. It is trained with both L1 and L2 regularization of the coefficients. Elastic-net is useful when there are multiple features which are correlated. Lasso is likely to pick one of these at random, while elastic-net is likely to pick both.

A practical advantage of trading off between Lasso and Ridge is that it allows Elastic-Net to inherit some of Ridge's stability under rotation.
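Under the same single-centered-predictor simplification used in the ridge and lasso sketches above (data values made up), the elastic-net update simply combines the two penalties: the L1 part soft-thresholds the signal (selection) and the L2 part inflates the denominator (shrinkage and stability).

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def elastic_net_slope(xs, ys, lam_l1, lam_l2):
    """One-predictor elastic net: S(sum(x*y), lam_l1) / (sum(x^2) + lam_l2)."""
    rho = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(rho, lam_l1) / (sum(x * x for x in xs) + lam_l2)

xs = [-2, -1, 0, 1, 2]
ys = [-2.0, -1.0, 0.0, 1.0, 2.0]           # sum(x*y) = 10, sum(x^2) = 10
b_plain = elastic_net_slope(xs, ys, 0, 0)  # no penalty: plain least squares
b_enet = elastic_net_slope(xs, ys, 2.0, 10.0)  # both penalties active
```

Setting either penalty to zero recovers pure lasso or pure ridge, which is why elastic net is described as a link between the two.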

How to select the right regression model?

Life is usually simple when you know only one or two techniques. One of the training institutes I know tells its students: if the outcome is continuous, apply linear regression; if it is binary, use logistic regression! However, the higher the number of options available at our disposal, the more difficult it becomes to choose the right one. A similar case happens with regression models.

With the many types of regression models available, it is important to choose the best-suited technique based on the types of the independent and dependent variables, the dimensionality of the data and other essential characteristics of the data. Below are the key factors you should consider when selecting the right regression model:

Data exploration is an inevitable part of building a predictive model. It should be your first step before selecting the right model, to identify the relationships and the impact of variables.

To compare the goodness of fit of different models, we can analyse different metrics like the statistical significance of parameters, R-squared, adjusted R-squared, AIC, BIC and the error term. Another one is Mallow's Cp criterion. This essentially checks for possible bias in your model, by comparing the model with all possible submodels (or a careful selection of them).

Cross-validation is the best way to evaluate models used for prediction. Here you divide your data set into two groups (train and validation). A simple mean squared difference between the observed and predicted values gives you a measure of the prediction accuracy.
If your data set has multiple confounding variables, you should not choose an automatic model selection method, because you do not want to put these in the model at the same time.
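The train/validation idea above can be sketched in plain Python (the split point, model and data are made up for illustration): fit a simple linear model on the training portion, then score it on the held-out portion with the root of the mean squared difference between observed and predicted values.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def rmse(xs, ys, a, b):
    """Root mean squared error of predictions a + b*x against observed y."""
    return math.sqrt(sum((y - (a + b * x)) ** 2
                         for x, y in zip(xs, ys)) / len(xs))

# made-up data, roughly y = 3x + 1
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [4.1, 6.9, 10.2, 12.8, 16.1, 18.9, 22.2, 24.8]
train_x, val_x = xs[:6], xs[6:]   # first 6 points for training
train_y, val_y = ys[:6], ys[6:]   # last 2 held out for validation
a, b = fit_line(train_x, train_y)
score = rmse(val_x, val_y, a, b)  # accuracy on unseen data
```

A low validation RMSE relative to the scale of y suggests the model generalizes; a validation error much larger than the training error is a warning sign of overfitting.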

It also depends on your objective. It can happen that a less powerful model is easier to implement compared with a highly statistically significant one. Regularized regression methods (Lasso, Ridge and ElasticNet) work well in the case of high dimensionality and multicollinearity among the variables in the data set.

End of the article

By now, I hope you have an overall picture of regression. These regression techniques should be applied considering the conditions of the data. One of the best tricks for finding out which technique to use is checking the family of variables, i.e. whether they are discrete or continuous.

In this article, we discussed seven types of regression and the key facts associated with each technique. If you are a newcomer to this field, I recommend that you learn these techniques and implement them in your models.




