Machine Learning Basics for a Newbie
Machine Learning (ML) gives computers the ability to learn without being explicitly programmed. Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, ML explores the study and construction of algorithms that can learn from and make predictions on data through building a model from sample inputs. It’s a really exciting & impactful phase in the ML journey.
Today, every time you go to a website, most likely there’s a Machine Learning algorithm behind the scenes, analysing the data and interactions, radically heightening your experience using ML. In this post we would be focusing on key areas of Machine Learning using some real cases. First, some intuition
In this post let’s discuss how you can predict the real estate property rates in India using ML.
Prices of real estate properties is critically linked with our economy. Accurately predicting real estate prices is not possible. However, prediction of real estate trends is a realistic prospect.
In Mumbai alone there are around 10,000 current listings of 350 areas or more at 99acres.com. This rich dataset should be sufficient to establish a regression model to accurately predict the real estate prices in Mumbai.
Using basic regression models and taking into account the geographical data, we can automate price prediction system for buyers of real estate properties, in order to find under / overpriced properties currently on the market. This can be useful for first time buyers with relatively little experience and suggest purchasing strategies for buying properties
“Regression is the study of predicting an output that is a real number based on values of a number of input features or attributes of entities under investigation.
Let’s say you have to determine whether a home is near prime south centres of Mumbai or suburbs. In Machine Learning terms, categorizing data points is a classification task. Geography has always been one of the primary factors for Mumbai’s astronomic real estate prices. As infrastructure development in Mumbai has always linear (or one-directional) from the South towards the Northern suburbs. Defining a ‘distance’ function would be a good way to distinguish the areas.
Based on the distance from the prime south city centres to the suburbs, you can argue that a home within a radius of 1 kilometre should be classified as one in prime south city.
“Classification is the problem of identifying to which category new observation belongs. A number of observations are already available with values of their various properties or features along with their correct categorization into one of the category from the known categories. The new observation is categorized based on its values for the features being analysed.
In case of real estate property rates cluster analysis can be used to identify groups of properties with similar neighbourhood. Sale price for a house in a specific neighbourhood, which had no sells, can be predicted based on house sells from similar neighbourhoods.
“Clustering is the task of grouping a set of objects in such a way that objects in the same group (cluster) are more similar in some sense or another to each other than to those in other clusters.
Conventional way of finding the right home is often a long, stressful, and tedious process. Applying recommender system makes it easier for buyers to find details and unique insights on properties they’re interested in – providing information to consumers that relevant to them.
“Recommenders recommend items to users based on characteristics of items, users and the common interests.
“Deep Learning helps computer learn complicated concepts by building them out of layers and layers of simple concepts. Sometimes these concepts being learnt are intuitively easy to a human however computationally complex.
Continuing with the above discussed example on predicting the real estate property rates, let us apply deep learning algorithms to quantify the same.
Real estate appraisal, which is a process of estimating the price for real estate properties, is crucial for both buys and sellers as the basis for negotiation as well as transaction. Conventionally, repeat sales model has been widely adopted to estimate real estate price.
Deep Learning algorithms can enable robust and accurate feature learning, which in turn produces the state-of-the-art performance on many computer vision related tasks, e.g. digit recognition, image classification, aesthetics estimation and scene recognition. These systems suggest that deep learning is very effective in learning robust features in a supervised or unsupervised fashion.
Applying this algorithm to real estate properties can help us estimate the real estate prices as well. You would want to know whether visual features, which is a reflection of a real estate property, can help estimate the real estate price. Intuitively, if visual features can characterize a property in a way similar to human beings, we should be able to quantify the house features using those visual responses. As real estate properties are closely related to the neighbourhood, one can also develop algorithms which can rely on 1) the neighbour information and 2) the attributes from pictures to estimate real estate property price.
1International Research Journal of Engineering and Technology, Volume: 04 Issue: 03 | Mar -2017