Evaluation of the Optimal Location for Erecting a Mexican Restaurant in the City of Madrid, Spain. - Khorgist.com

Monday, 24 June 2019

Evaluation of the Optimal Location for Erecting a Mexican Restaurant in the City of Madrid, Spain.




by
Molokwu Reginald Chukwuka



In partial fulfillment of the requirements for the award of the Specialization Certificate and Professional Badge in Data Science from IBM




Date: June 24, 2019



Type: Peer Review



Table of Contents

1.1. Problem Description
1.2. Data Presentation
1.3. Target Audience
3.1. Methodology
3.2. Results
3.3. Discussion
4. Conclusion






















1.1. Problem Description
This project aims to find the best possible location for a Mexican restaurant in the city of Madrid, Spain. To achieve this, an analytical approach is used, based on machine learning and data analysis techniques, specifically clustering, supported by some data visualization.
During the analysis, several data transformations are performed in order to find the most suitable data format for the machine learning model to ingest. Once the data is prepared, a modeling process is carried out, and the resulting analysis suggests the most promising places to locate the Mexican restaurant.


1.2. Data Presentation
The data used to develop this project comes from two sources:
1. The Foursquare API: This data is accessed via Python and used to obtain the most common venues per neighborhood in the city of Madrid. This makes it possible to get a sense of how the city's venues are distributed and which places are most common for leisure and, in general, gives an idea of what people like.

2. The Madrid City Hall's Web Portal: This site provides several data sources of great use for solving this problem. The files are provided in Excel format and are prepared for statistical analysis and use. The data contains up-to-date information about the immigrant population per country and per nationality. This data is analyzed in order to determine the best location for a new venue, restaurant, or other business based on people's nationalities. For the sake of simplicity, this exercise assumes that people's tastes vary according to their nationality, and that people from a specific country will be more attracted to places that match the environment and culture of their own country than to those of foreign countries.

You can access the data by clicking this link


1.3. Target Audience

The target audience of this project is any business owner planning to open new business premises: a restaurant, a real estate agency, a shop, etc. Since this approach is applicable not only to Mexican restaurants but also to other kinds of businesses, anybody considering opening or relocating business premises could benefit from it.










3.1. Methodology
The methodology used to approach this problem includes some statistical exploration of the data and some visualizations. The main machine learning technique involved is clustering, specifically the K-Means algorithm, implemented in Python.
Initially, the main difficulty was obtaining the data needed to approach the problem constructively. Solving this kind of optimal business location problem usually requires a large amount of consumer data, but for this example, and for the sake of simplicity, the focus was placed mainly on the population's nationality. A study was carried out on the inhabitants of Madrid, and it was assumed that people from a given country prefer restaurants tied to their own country and food over restaurants from other countries or ones that have nothing to do with their culture. This applies especially to immigrant populations, who live away from their countries and would likely want a regular taste of their food and original culture.
In the end, it is not only about the food; it is also about having a piece of the country in question. When someone enters an Italian, American, or Peruvian restaurant, they consume not only the food and culinary specialties of that country, but also its culture, people, music, and decoration. All of this should make people feel as if they were in that country.
With all this considered, the approach to solving this problem was, first, to define the target population; second, to find the areas where this population lives; and finally, to examine the venues and restaurants in those areas to see whether our product could work.
Here is an example of the data used:
This data contains the size of the immigrant populations living in each neighborhood of Madrid. The main features are the country of origin, which indicates where the people living in each neighborhood come from, and the number of people per country living in each neighborhood. With this, it is already possible to get an idea of where our target population is located. In this project, the idea is to open a Mexican restaurant in the city, and further analysis will show where. Nevertheless, this task could not be achieved by working with this raw data alone. Besides the kind of population inhabiting the different neighborhoods, information about the most common venues in these neighborhoods was also needed, as well as some way of measuring how similar or different the neighborhoods are from one another. To that end, the Foursquare API was used to obtain the data about the venues in each neighborhood. To use it, however, the raw data first had to be transformed into something the Foursquare API could handle: essentially, the coordinates of each neighborhood were needed.
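As a minimal sketch of how the target population could be located from a table like this, the snippet below sums the residents of one country of origin per neighborhood and ranks the neighborhoods. The records and counts are made-up placeholders, not values from the Madrid data set, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (neighborhood, country of origin, residents).
# The real figures come from the Madrid City Hall spreadsheets.
records = [
    ("Centro", "Ecuador", 120),
    ("Centro", "Bangladesh", 300),
    ("Usera", "Ecuador", 450),
    ("Usera", "Bulgaria", 80),
    ("Latina", "Ecuador", 200),
]

def residents_by_neighborhood(rows, country):
    """Sum the residents from `country` in each neighborhood."""
    totals = defaultdict(int)
    for neighborhood, origin, count in rows:
        if origin == country:
            totals[neighborhood] += count
    # Largest communities first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranking = residents_by_neighborhood(records, "Ecuador")
print(ranking)
```

The same function can be called with any other country of origin present in the table to rank neighborhoods for a different target population.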
This is an example of the transformed data:
Once the data was transformed into a format the Foursquare API could ingest, the information about the venues could be obtained. The neighborhoods were then plotted on a map of Madrid to give an idea of their geographical location:

The next step was to obtain the nearby venues by neighborhood, together with their respective coordinates:
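The request for one neighborhood might be built as in the sketch below. It assumes the legacy Foursquare v2 `venues/explore` endpoint, which may no longer be available to new accounts; the report does not state which endpoint was used. The credentials are placeholders, and the radius, limit, and version date are illustrative assumptions. The snippet only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint.
BASE_URL = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build the request URL for venues around one neighborhood centre."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,          # API version date (YYYYMMDD), assumed value
        "ll": f"{lat},{lng}",  # latitude,longitude of the neighborhood
        "radius": radius,      # search radius in metres, assumed value
        "limit": limit,        # maximum number of venues, assumed value
    }
    return f"{BASE_URL}?{urlencode(params)}"

# Placeholder credentials and approximate coordinates of central Madrid.
url = build_explore_url(40.4168, -3.7038, "YOUR_ID", "YOUR_SECRET")
print(url)
```

Calling the function once per neighborhood, with that neighborhood's coordinates, yields one request per row of the transformed data.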
The resulting sample shows the names of the venues, their coordinates, and the category of each venue. The results are ordered by neighborhood. This is a vital step in the segmentation process, since all the important data about the venues is obtained from here. Once the venues per neighborhood were obtained, the next step was to look at the mean frequency of each venue category per neighborhood:
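A rough sketch of this step follows, under the assumption that each venue is reduced to a (neighborhood, category) pair. Taking the mean of one-hot encoded categories per neighborhood is the same as computing each category's share of that neighborhood's venues, which is what the code does. The venue list is a made-up placeholder and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (neighborhood, venue category) pairs, one per venue.
venues = [
    ("Centro", "Tapas Restaurant"),
    ("Centro", "Tapas Restaurant"),
    ("Centro", "Mexican Restaurant"),
    ("Centro", "Pizza Place"),
    ("Usera", "Pizza Place"),
    ("Usera", "Gym"),
]

def category_frequencies(pairs):
    """Return {neighborhood: {category: share of that neighborhood's venues}}."""
    counts = defaultdict(Counter)
    for neighborhood, category in pairs:
        counts[neighborhood][category] += 1
    return {
        hood: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for hood, cats in counts.items()
    }

def most_common(frequencies, top=3):
    """List each neighborhood's categories, most frequent first."""
    return {
        hood: sorted(cats, key=cats.get, reverse=True)[:top]
        for hood, cats in frequencies.items()
    }

freqs = category_frequencies(venues)
print(freqs["Centro"])
print(most_common(freqs, top=1))
```

The per-neighborhood frequency vectors produced this way are the kind of input a clustering algorithm can work with, since every neighborhood is described on the same set of categories.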
  
This process is incremental: once one piece of information is obtained, it is possible to move on to the next. With this data in hand, the segmentation can be made and the clusters created. First, however, it is necessary to determine the appropriate number of clusters. To do this, the elbow method was used. This method consists of running the clustering for a range of candidate numbers of clusters and drawing a curve of the sum of squared distances from each point to the center of its cluster. As the number of clusters grows, these distances decrease, and at some point the decrease becomes so small that adding more clusters is no longer worthwhile. Beyond that point, creating more divisions in the data (clusters) is pointless, as the difference between groups becomes very hard to appreciate:
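To make the elbow method concrete, here is a from-scratch sketch using plain Lloyd's k-means on toy 2-D points. It is a stand-in for illustration, not the implementation used in this project, and the points, seed, and range of k are assumptions. It prints the within-cluster sum of squared distances for each candidate number of clusters; the elbow is the value of k after which that number stops dropping sharply.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans_inertia(points, k, iterations=50, seed=0):
    """Run plain Lloyd's k-means; return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

A single run from a random start can end in a local optimum, so in practice the clustering is usually repeated with several starts for each k before the curve is read.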
On the resulting curve, the bend was identified at 5 clusters, so the optimal number of clusters for this problem was set to 5. With this done, it is now possible to build the clusters and have a look at them:
These are the 5 clusters on the map of Madrid. It is possible to see how many neighborhoods belong to each cluster, which is also important information. Now it is possible to examine the data of each cluster:

Cluster 1:

Cluster 2:

Cluster 3:
Cluster 4:


Cluster 5:

This kind of approach allows us to analyze an entire city by looking at its venues and population. With this information, observations and conclusions can now be made.

3.2. Results
The results were five clusters with very different population and venue distributions. The following is a description of the clusters:
• Cluster One: Inhabited by Bulgarians; the most common venue is the seafood restaurant.
• Cluster Two: Mostly inhabited by South Americans, Europeans, and North Americans. The most common venues are tapas restaurants, Argentinian restaurants, pizza places, supermarkets, and Spanish restaurants, among many others.
• Cluster Three: This cluster is composed of only 3 population groups: Americans, Ukrainians, and people from the Dominican Republic. The most common venues are pizza places, gyms, shopping malls, churches, and bakeries, among others.
• Cluster Four: This cluster is composed only of Bangladeshi people. The most common venues are Spanish restaurants, falafel restaurants, fish markets, fast food restaurants, and electronics stores.
• Cluster Five: Again, only people from one country, Ecuador, seem to live in this cluster. The most common venues are soccer fields, burger joints, plazas, and fast food restaurants.

3.3. Discussion
It is interesting how the venues and the nationalities of the inhabitants vary from one cluster to another. These two variables are the main source of differentiation. Each cluster has its own characteristics, but also points in common with other clusters. Examining these results in more detail allows some conclusions to be drawn. As a recommendation, it must be said that in a study of this size, more data is needed to make good predictions about where to open a certain business or shop: for example, socio-demographic data about the population, such as income level, whether they have children, education level, and occupation. Some of the most important data to examine carefully concern people's tastes: how they prefer to spend their leisure time, what kinds of food they like, and what their hobbies are. With all this data gathered, a more in-depth analysis could be performed, and the segmentation would be more accurate. For this project, this data was not available and was also outside the project's scope.


4. Conclusion
As far as this data shows, no Mexican population is registered in Madrid. However, in Cluster 2 there is a Mexican restaurant located in the "Centro" neighborhood, which is the city center.
A closer examination of this cluster shows that its inhabitants are mostly Latinos, mixed with some Europeans, but mainly the people living in this cluster come from South American countries. Apart from this, other kinds of Latin restaurants can be found, such as Argentinian restaurants, tapas restaurants, and Italian restaurants. So it is reasonable to say that the inhabitants of this area like these kinds of food.
Following this logic, to open a new Mexican restaurant in the city, or in fact any kind of restaurant, it would only be necessary to find where restaurants similar to the one we want to open are located, study the population in that area, and find similar clusters of population in the city that have no restaurants, or very few, like the one we would like to open.
In this example, clusters 4 and 5 could be a good match for our target population. Looking at the venues in these clusters, it is possible to find one Mexican restaurant and a good number of fast food, Argentinian, and South American restaurants. So, in these clusters, the existing restaurants can be said to match the population's nationalities and tastes.
In conclusion, taking into consideration the explanations given above as well as the data, it is quite possible that clusters 4 and 5 would be good places to open our Mexican restaurant. As explained above, the same logic could be applied to opening other kinds of restaurants or businesses in any other area of the city. It is only necessary to examine the existing businesses in our target area and study the population, then compare these 2 factors with the same ones in areas where there are existing businesses like the one we want to open, and then verify whether they match.
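The matching logic described above could be sketched roughly as follows. The report does not name a similarity measure, so cosine similarity between population profiles is an assumption made here for illustration, as are the thresholds, the cluster names, and every number in the example data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two sparse profiles given as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical cluster profiles: population share by region of origin
# and number of existing restaurants of the kind we want to open.
clusters = {
    "A": {"population": {"South America": 0.7, "Europe": 0.3},
          "target_venues": 3},
    "B": {"population": {"South America": 0.8, "Europe": 0.2},
          "target_venues": 0},
    "C": {"population": {"Asia": 1.0}, "target_venues": 0},
}

def candidate_clusters(profiles, reference, max_venues=1, min_similarity=0.8):
    """Clusters whose population resembles the reference cluster's
    but which have few restaurants of the target kind."""
    ref_population = profiles[reference]["population"]
    scored = []
    for name, profile in profiles.items():
        if name == reference or profile["target_venues"] > max_venues:
            continue
        score = cosine_similarity(ref_population, profile["population"])
        if score >= min_similarity:
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

candidates = candidate_clusters(clusters, reference="A")
print([name for name, _ in candidates])
```

Here the reference cluster is the one that already hosts restaurants like ours, and the candidates are the clusters with a similar population but little existing competition.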

