A simple data science experiment with Azure Machine Learning Studio
What are Machine Learning, data science, and Azure Machine Learning Studio?
- Machine Learning is concerned with computer programs that automatically improve their performance through experience. It learns from previous experience or data.
- Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining. (Wikipedia)
- Azure Machine Learning Studio is a tool used to develop predictive analytics solutions in the Microsoft Azure cloud.
Azure Machine Learning Studio is an excellent tool for developing and hosting Machine Learning applications. You don’t need to write code; you can build an experiment by dragging and dropping modules. Here we will create a simple Machine Learning experiment using Azure Machine Learning Studio.
Tools and Technology used
- Azure Machine Learning Studio
Now let's create our experiment step by step
Step 1: Create Azure Machine Learning Workspace
- Go to https://portal.azure.com and log in using your Azure credentials
- Click More Services in the left panel of the Azure portal
- Click “Machine Learning Studio Workspace” under the “Intelligence + Analytics” category
- Add a workspace by clicking the Add (+) button at the top left corner
- Choose a pricing tier and select it. The figure below shows the pricing tiers.
- Finally, click the Create button
Step 2: Launch Machine Learning Studio
- Click Launch Machine Learning Studio to open the studio
- Then log in to the portal
Step 3: Create a blank experiment
- Select the Experiments menu, then click New (+) at the bottom left corner.
- Click Blank Experiment. In addition to the blank experiment, there are many sample experiments that you can load and modify.
- Once the new blank experiment has loaded, you will see the Azure ML Studio visual designer as follows.
Step 4: Add a dataset in the ML Studio visual designer
- You can import a dataset or use a saved one. In this case we use a saved sample dataset.
- Click Saved Datasets at the top left corner.
- Drag and drop “Adult Census Income Binary Classification dataset” from Saved Datasets -> Sample
Step 5: Select columns in dataset
- Expand Data Transformation -> Manipulation
- Drag and drop “Select Columns in Dataset” to the visual surface
- Connect the dataset to “Select Columns in Dataset” on the visual surface
- Click the “Select Columns in Dataset” module
- Click “Launch column selector” in the property pane
- Select “WITH RULES”
- Add the age, education, marital-status, relationship, race, sex, and income columns, then click the tick mark at the bottom right corner.
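For readers who think in code, the effect of the “Select Columns in Dataset” module can be sketched in plain Python. This is only an illustration of the idea, not what the Studio runs internally; the helper name select_columns and the sample row are made up for the example.

```python
# Column names from Step 5 of this walkthrough.
SELECTED_COLUMNS = [
    "age", "education", "marital-status",
    "relationship", "race", "sex", "income",
]


def select_columns(rows, columns):
    """Return new rows that keep only the named columns, in the given order."""
    return [{name: row[name] for name in columns} for row in rows]


# A single made-up row with a few extra columns, for illustration only.
sample_rows = [
    {
        "age": 39, "workclass": "State-gov", "education": "Bachelors",
        "marital-status": "Never-married", "occupation": "Adm-clerical",
        "relationship": "Not-in-family", "race": "White", "sex": "Male",
        "hours-per-week": 40, "income": "<=50K",
    },
]

selected = select_columns(sample_rows, SELECTED_COLUMNS)
print(selected[0])
```

Columns that were not selected, such as workclass, are dropped from every row.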
Step 6: Split up the dataset
- Split your input data into two parts: training data and validation data
- Expand “Data Transformation” -> “Sample and Split” in the left pane
- Drag and drop Split Data onto the Azure Machine Learning Studio visual surface
- Connect “Select Columns in Dataset” to the Split Data module on the visual surface
- Click the Split Data module and set the value of Fraction of Rows to 0.80 in the right pane of the visual designer surface. This means 80 percent of the data will be used for training and the rest will be used for validation.
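The 80/20 split can be pictured with a few lines of Python. This is a minimal sketch that assumes a randomized split; the function name split_rows and the seed parameter are illustrative and are not part of Azure ML Studio.

```python
import random


def split_rows(rows, fraction=0.8, seed=0):
    """Shuffle a copy of the rows and cut it into two parts.

    The first part holds `fraction` of the rows (training data),
    the second part holds the rest (validation data).
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 dataset rows
train, validation = split_rows(rows, fraction=0.8)
print(len(train), len(validation))
```

Every row ends up in exactly one of the two parts, so the validation data is never seen during training.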
Step 7: Train the model
- Expand “Machine Learning” -> “Train” in the left pane
- Drag and drop “Train Model” onto the Azure ML Studio visual surface
- Connect dataset1 of the Split Data module to Train Model (second input point of Train Model, as in the figure below)
- Expand Machine Learning -> Initialize Model -> Classification in the left pane
- Drag and drop “Two-Class Boosted Decision Tree” as shown in the figure
- Connect “Two-Class Boosted Decision Tree” to Train Model (first input point of Train Model, as in the figure below)
Step 8: Choose columns for prediction
- Click the Train Model module
- Click “Launch column selector” in the property pane
- Select Include and add the column name “income”, because this experiment will predict income.
- Click the tick mark at the bottom right corner
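Choosing “income” here tells the trainer which column is the label to predict; the remaining columns act as features. A rough Python sketch of that separation follows. The helper split_features_label and the two rows are hypothetical and only illustrate the idea.

```python
def split_features_label(rows, label_column):
    """Separate each row into its feature columns and its label value."""
    features = [
        {name: value for name, value in row.items() if name != label_column}
        for row in rows
    ]
    labels = [row[label_column] for row in rows]
    return features, labels


# Made-up rows for illustration only.
training_rows = [
    {"age": 39, "education": "Bachelors", "sex": "Male", "income": "<=50K"},
    {"age": 52, "education": "Masters", "sex": "Female", "income": ">50K"},
]

features, labels = split_features_label(training_rows, "income")
print(features)
print(labels)
```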
Step 9: Score the model
- Expand “Machine Learning” -> “Score”
- Drag and drop “Score Model” to the visual design surface.
- Connect Train Model to Score Model (first input point of Score Model, as in the figure below)
- Connect “Split Data” to “Score Model” (second output point of Split Data to the second input point of Score Model, as in the figure below)
Step 10: Evaluate the model
- Expand “Machine Learning” -> “Evaluate”
- Drag and drop “Evaluate Model” to the visual design surface.
- Connect “Score Model” to “Evaluate Model” (first input point of Evaluate Model, as in the figure below)
- Now click “Run” at the bottom of Azure ML Studio. After processing, if every stage is marked green, the run succeeded.
- After the run completes, right-click Evaluate Model -> Evaluation Result -> Visualize
- You will see the accuracy curve as shown below.
- Click Save As at the bottom of the screen
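To make the evaluation step more concrete, here is a small Python sketch of how accuracy, precision, and recall can be computed from actual and predicted labels. The label lists are invented for the example, and the function is an illustration rather than the Evaluate Model implementation.

```python
def evaluate(actual, predicted, positive=">50K"):
    """Compute basic binary-classification metrics from two label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for five validation rows.
actual = [">50K", "<=50K", "<=50K", ">50K", "<=50K"]
predicted = [">50K", "<=50K", ">50K", "<=50K", "<=50K"]

metrics = evaluate(actual, predicted)
print(metrics)
```

In this toy data three of five predictions are correct, so the accuracy is 0.6.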
Step 11: Set up a web service
- Click Set Up Web Service -> Predictive Experiment
- Connect Web Service Input to Score Model (as shown in the figure below)
- Click the “Select Columns in Dataset” module and remove the income column from the dataset, because the model is now ready to predict income.
- Save and run the experiment from the bottom of ML Studio
Step 12: Deploy Web Service
- Click Deploy Web Service -> Deploy Web Service [Classic] at the bottom of ML Studio
- After deployment completes, you will see a dashboard with documentation for testing and consuming the service, as shown below
- Click the “Test” button on the dashboard
- A popup dialog will ask for input
- Enter input values like those shown below and click the tick mark
- You will see the output as in the figure. Here the predicted income is > 50K
Now you have developed a simple data science experiment. You can now integrate it with your application. The API links, security key, and necessary documentation are given in the dashboard.
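As a starting point for calling the service from an application, the sketch below builds an HTTP request with Python's standard library. The URL and API key are placeholders, and the JSON layout (Inputs, input1, ColumnNames, Values) is an assumption based on the usual request-response format of classic web services; check the API help page on your own dashboard for the exact schema before relying on it.

```python
import json
import urllib.request


def build_request(url, api_key, columns, values):
    """Build (but do not send) a POST request for one row of input."""
    # Assumed payload layout; verify it against your dashboard's API help page.
    payload = {
        "Inputs": {"input1": {"ColumnNames": columns, "Values": [values]}},
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_request(
    "https://example.invalid/execute",  # placeholder, copy the real URL from the dashboard
    "YOUR_API_KEY",                     # placeholder, copy the real key from the dashboard
    ["age", "education", "marital-status", "relationship", "race", "sex"],
    ["39", "Bachelors", "Never-married", "Not-in-family", "White", "Male"],
)

# To actually call the service, send the request and read the JSON reply:
# response = urllib.request.urlopen(request).read()
print(request.get_method(), request.full_url)
```

The request is only constructed here, so the example runs without network access or real credentials.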