How To Migrate A Database To IBM DB2 On Cloud? – A View On What Can Be Migrated & What Cannot…


Database migration can look simple from the outside: get the source data, then import/load it into the target database. But the devil is always in the details, and the route is not that simple. A database consists of more than just data; it is made up of many different, but often related, objects. In DB2, two types of objects exist – system objects and data objects. Let’s see what they are; later in the article, some of the major objects are discussed from a caution perspective for planning and migration.

Most of the major database engines offer the same set of major database object types. (Please read up on these object types in the respective vendors’ documentation; the definitions and functions are broadly similar. An analogy: when you move from one car to another, the ignition button, windows, and overall structure differ, but the functional base of the car, such as the four wheels, engine, and chassis, remains the same.)

  • Tables
  • Indexes
  • Sequences
  • Views
  • Synonyms
  • Aliases
  • Triggers
  • User-defined data types (UDTs)
  • User-defined functions (UDFs)
  • Stored procedures
  • Packages

System Objects include:

  • Storage groups
  • Table spaces
  • Buffer pools
  • System Catalog tables and views
  • Transaction log files

These objects in the on-premises database should be given proper care while planning a migration. It is very important to understand what can be migrated and what cannot, since professional services from a third party or from the cloud vendor might be needed.

Let’s see what can be migrated and what cannot.

General SQL user-defined functions (UDFs) can be migrated, but external UDFs might have problems being migrated. External UDFs may be written in C, C++, or Java and, in some cases, compiled into a library that sits at a specified location and must be registered with DB2. External UDFs therefore need to be rebuilt on the cloud servers, because the OS version at the target might differ; migrating them may require database migration services from the cloud vendor, or they may not be migratable to the cloud at all. Similarly, SQL stored procedures can be migrated to the target database, but external stored procedures carry the same constraints as external UDFs and are not supported.

Materialized query tables (MQTs) can be migrated, but they should be created after the data has been moved to the target database. Likewise, triggers should be created once the data is in the target database. The link between system-period temporal tables and their associated history tables must be broken before the tables’ data can be moved (this also holds for bitemporal tables). A system-period temporal table is a table that maintains historical versions of its rows. Bitemporal modeling is an information modeling technique designed to handle historical data along two different timelines; a bitemporal table combines the historical tracking of a system-period temporal table with the time-specific data storage capabilities of an application-period temporal table. Bitemporal tables are generally used to keep user-based period information as well as system-based historical information.

Now we have some idea of what to migrate and what not to. Database administrators should plan for some downtime while performing this. Proper planning and caution should be applied to each of the activities discussed, and adequate time should be allotted based on the nature of the migration. Let me also point out a major constraint when migrating to DBaaS or a DB instance on cloud (from a system-object point of view): only one buffer pool is supported, and the user table spaces should be merged into the main user table space with one buffer pool before migrating to the target. Multiple table spaces with multiple buffer pools are not supported for DBaaS or DB on a cloud instance (VM). So remember that!

Now we can start the migration. IBM provides certain tools for migration tasks; the important ones are the db2look utility and IBM Optim High Performance Unload. The db2look utility generates the Data Definition Language (DDL) statements for the target DB2. IBM Optim High Performance Unload can copy the current database to a temporary folder/bucket, which can be AWS S3 or SoftLayer Swift, and the same output can then be loaded into the target through the import/load utilities.

The various ways to move data to DB2 on cloud – DB2 Hosted, DB2 on Cloud, and DB2 Warehouse on Cloud – are given below:

  • Load data from a local file stored on the desktop (Using #Bluemix interface)
  • Load data from a Softlayer swift object store (Using #Bluemix interface)
  • Load data from Amazon S3 (Using #Bluemix interface)
  • Use the DB2 data movement utilities remotely
  • Use the IBM data transfer service (25 TB to 100 TB)
  • Use the IBM Mass Data Migration service (100 TB or more)

Now comes the security aspect of migrating: encryption using AES or 3DES is recommended, and SSL/TLS are the preferred methods to secure data in transit.

Let’s also throw some light on DB2’s native encryption and how it works.

  • The client requests an SSL connection and lists its supported cipher suites (AES, 3DES).
  • The server responds with a selected cipher suite and a copy of its digital certificate, which includes a public key.
  • The client checks the validity of the certificate; if it is valid, a session key and a message authentication code (MAC) are encrypted with the public key and sent back to the server.
  • The server decrypts the session key and MAC, then sends an acknowledgment to start an encrypted session with the client.
  • The server and client securely exchange data using the session key and MAC selected.
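As a toy illustration of the last two steps, here is a plain-Python sketch (not DB2’s actual implementation): once both sides share a session key, a MAC lets the receiver detect tampering. Real TLS derives its keys from the negotiated cipher suite and key exchange; the derivation below is a deliberate simplification.

```python
import hashlib
import hmac
import secrets

def make_session(shared_secret: bytes) -> bytes:
    """Derive a toy session key; real TLS uses the key-exchange output."""
    return hashlib.sha256(shared_secret).digest()

def seal(session_key: bytes, payload: bytes):
    """Attach a MAC so the receiver can detect tampering."""
    mac = hmac.new(session_key, payload, hashlib.sha256).digest()
    return payload, mac

def open_sealed(session_key: bytes, payload: bytes, mac: bytes) -> bytes:
    """Verify the MAC before trusting the payload."""
    expected = hmac.new(session_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC mismatch: message was tampered with")
    return payload

# Client and server share the session key after the handshake.
key = make_session(secrets.token_bytes(32))
payload, mac = seal(key, b"SELECT * FROM employees")
assert open_sealed(key, payload, mac) == b"SELECT * FROM employees"
```

Flipping a single byte of the payload or the MAC makes `open_sealed` raise, which is exactly the integrity guarantee the MAC provides in the encrypted session.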

These are some of the important points to be considered while migrating on-premise database to IBM DB2 on Cloud.

Feel free to share your views.

#IBM #Database #Bluemix #db2look #DB2


All You Need To Know About AWS CloudFront


CloudFront is a caching service from AWS that supports quality of service for users spread across geographies without keeping all the data at every location all the time.

How to start using CloudFront:
1. Configure the origin server
2. Upload files to the origin server
3. Create a CloudFront distribution
4. CloudFront sends the distribution’s configuration to its edge locations
5. Use the domain name given by CloudFront in your application

You configure your origin servers, from which CloudFront gets your files for distribution from CloudFront edge locations all over the world.

An origin server stores the original, definitive version of your objects. If you’re serving content over HTTP, your origin server is either an Amazon S3 bucket or an HTTP server, such as a web server.

You create a CloudFront distribution, which tells CloudFront which origin servers to get your files from when users request the files through your website or application.

As you develop your website or application, you use the domain name that CloudFront provides for your URLs.

Alternate Domain Names (CNAMEs) (Optional)
Specify one or more domain names that you want to use for URLs for your objects instead of the domain name that CloudFront assigns when you create your distribution.

For example, you might want the URL for the object /images/image.jpg to use one of your own domain names instead of the domain name that CloudFront assigns to the distribution.

When a user accesses the website and requests an object, DNS routes the request to an edge location. The edge location checks whether the file is cached there; if not, it fetches it from the origin server (S3 or an HTTP server).

The origin servers send the files back to the CloudFront edge location.

As soon as the first byte arrives from the origin, CloudFront begins to forward the files to the user. CloudFront also adds the files to the cache in the edge location for the next time someone requests those files.

After an object has been in an edge cache for 24 hours, or for the duration specified in your file headers, CloudFront does the following: it forwards the next request for the object to your origin to determine whether the edge location has the latest version.

If the version in the edge location is the latest, CloudFront delivers it to your user.

If the version in the edge location is not the latest, the origin server sends the latest version to CloudFront, and CloudFront delivers it to the user. CloudFront then keeps that latest version in the cache until it expires again.
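The caching behaviour described above can be sketched as a toy edge cache (illustrative Python, not CloudFront’s actual code): serve from the cache while the object’s TTL is fresh, otherwise go back to the origin and re-cache.

```python
import time

class EdgeCache:
    """Toy model of a CloudFront edge location: serve from the cache while
    the object's TTL has not expired, otherwise re-fetch from the origin."""

    def __init__(self, origin_fetch, default_ttl=86400):  # 24 hours
        self.origin_fetch = origin_fetch   # callable: path -> content
        self.default_ttl = default_ttl
        self.cache = {}                    # path -> (content, expires_at)

    def get(self, path, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(path)
        if entry and now < entry[1]:       # cache hit, still fresh
            return entry[0], "hit"
        content = self.origin_fetch(path)  # miss or expired: go to origin
        self.cache[path] = (content, now + self.default_ttl)
        return content, "miss"

origin_calls = []
def origin(path):
    origin_calls.append(path)
    return f"body-of-{path}"

cdn = EdgeCache(origin, default_ttl=86400)
cdn.get("/images/image.jpg", now=0)        # miss: fetched from origin
cdn.get("/images/image.jpg", now=100)      # hit: served from the edge
cdn.get("/images/image.jpg", now=90000)    # expired: origin is consulted again
assert len(origin_calls) == 2
```

Real CloudFront additionally revalidates with the origin (so an unchanged object is not re-downloaded), but the hit/miss/expiry flow is the same idea.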

CloudFront has use cases with both static and dynamic content.

Pricing of CloudFront:
Data Transfer Out to Internet:
You are charged for the volume of data transferred out of CloudFront edge locations, measured in GB. If the data is generated and rendered by other services as well, you have to include the cost of compute, storage, GET requests, and data transfer out of those services too. The cost is measured per geographical region for billing.

Data Transfer out to origin:
There is a per-GB charge for data transferred from CloudFront edge locations to your origin (for example, POST and PUT uploads). It applies both to AWS origin servers and to your own origin servers.

HTTP/HTTPS requests:
There is a charge for every HTTP/HTTPS request served by CloudFront.

Invalidation requests:
You can invalidate up to 1,000 paths each month at no additional charge. Beyond that, a per-path charge is included in the billing.

Dedicated IP custom SSL:
You pay $600 per month for each custom SSL certificate associated with one or more CloudFront distributions when using the dedicated-IP version of custom SSL certificate support.
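To see how these charge components combine, here is a back-of-envelope calculator. All the rates are illustrative placeholders, since real CloudFront pricing varies by region and changes over time; only the 1,000 free invalidation paths come from the text above.

```python
def cloudfront_monthly_cost(gb_out_internet, gb_out_origin, https_requests,
                            invalidation_paths,
                            rate_out=0.085, rate_origin=0.02,
                            rate_per_10k_https=0.01, rate_invalidation=0.005):
    """Back-of-envelope CloudFront bill with placeholder per-unit rates."""
    free_invalidations = 1000   # first 1,000 paths per month are free
    billable_paths = max(0, invalidation_paths - free_invalidations)
    return round(
        gb_out_internet * rate_out            # edge -> internet, per GB
        + gb_out_origin * rate_origin         # edge -> origin, per GB
        + (https_requests / 10_000) * rate_per_10k_https
        + billable_paths * rate_invalidation, 2)

# e.g. 500 GB out, 20 GB back to origin, 2M HTTPS requests, 1,200 paths
cost = cloudfront_monthly_cost(500, 20, 2_000_000, 1200)
```

Only the 200 paths above the free tier are billed in this example; swap in the current published rates for your billing region to get a real estimate.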

When To Prefer MPLS over Site-to-site VPN? A Short Summary

A very short write-up on why to prefer MPLS over site-to-site VPN or ILL; my sincere apologies that it is not in detail.

MPLS stands for Multiprotocol Label Switching. It is an open-standard, data-carrying technology for high-performance networks.

Why MPLS is fast?

Site-to-site VPN works on a tunneling mechanism and has to undergo several steps: the packet is generated, passes down through the network layers, and is then encapsulated and encrypted for the tunnel. Each step adds processing time, and thus latency. MPLS is fast because forwarding is done on a short label (often described as layer 2.5) rather than a full layer-3 routing lookup at every hop, so there is less overhead and lower latency. MPLS works on label-switched paths rather than endpoints and helps route the traffic through an optimal path. Certain types of traffic can also be labelled so that they get precedence over another path.
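The label-switching idea can be sketched in a few lines (a toy model, not a real router’s forwarding table): each hop performs a single table lookup on the incoming label, swaps it for an outgoing label, and forwards, with no full routing decision along the way.

```python
# Each MPLS router keeps a label forwarding table: incoming label ->
# (outgoing label, next hop). Forwarding is one table lookup per hop.
LFIB = {
    "R1": {100: (200, "R2")},
    "R2": {200: (300, "R3")},
    "R3": {300: (None, "exit")},   # None: pop the label, leave the MPLS cloud
}

def forward(router, label, path=None):
    """Follow the label-switched path hop by hop."""
    path = (path or []) + [router]
    out_label, next_hop = LFIB[router][label]
    if out_label is None:          # label popped at the egress router
        return path + [next_hop]
    return forward(next_hop, out_label, path)

route = forward("R1", 100)
# route == ["R1", "R2", "R3", "exit"]
```

The ingress router classifies the traffic once and assigns the first label; everything after that is cheap label swapping, which is where the speed advantage over per-hop IP routing comes from.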


MPLS, unlike site-to-site VPN, is easy to configure and user friendly. It is like a cloud to which the various end users/sites connect, freeing you from site-to-site VPN’s tedious per-tunnel configuration.

ILL, on the other hand, serves the same basic function as MPLS, but it provides connectivity only: it performs no labelling, cannot steer traffic onto the lowest-latency path, and thus performance is not guaranteed.


In larger organizations, MPLS is “a must” to be certain about latency. If not MPLS, in some cases site-to-site VPN can serve the purpose, but it will be an expensive solution and will demand a huge administrative effort. ILL is suitable for small offices and branch offices, but it is not an optimal or secure way of providing accessibility.

Secure Your OpenStack API with TLS

To secure your OpenStack API with TLS, you need a certificate, either CA-signed or self-signed. The OpenStack API workers natively support SSL/TLS. Apache httpd or nginx can help if you want to use an external authentication system, viz. Kerberos, SAML, or OpenID.

Let’s check how you can achieve it.

Let’s assume you have Apache well set up; you will need three virtual hosts.

  1. The first VirtualHost will respond on port 80 (HTTP) to redirect all users to port 443 (HTTPS).

The following configuration forces the use of HTTPS:

<VirtualHost <ip address>:80>
ServerName <site FQDN>
RedirectPermanent / https://<site FQDN>/
</VirtualHost>

Now you can fill in the machine IP and FQDN:

<VirtualHost 192.168.100.XX:80>
ServerName api.test.local
RedirectPermanent / https://api.test.local/
</VirtualHost>

2. The second section sets up the HTTPS VirtualHost using the following template:

<VirtualHost <ip address>:443>
ServerName <site FQDN>
SSLEngine On
SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2
SSLCipherSuite HIGH:!RC4:!MD5:!aNULL:!eNULL:!EXP:!LOW:!MEDIUM
SSLCACertificateFile /path/<site FQDN>.crt
SSLCertificateFile /path/<site FQDN>.crt
SSLCertificateKeyFile /path/<site FQDN>.key
WSGIScriptAlias / <WSGI script location>
WSGIDaemonProcess horizon user=<user> group=<group> processes=3 threads=10
Alias /static <static files location>
<Directory <WSGI dir>>

# In Apache HTTP Server 2.4 and later:
Require all granted
# For HTTP Server 2.2 and earlier:
# Order allow,deny
# Allow from all

</Directory>
</VirtualHost>

3. The third section secures port 8447, where the API runs:

<VirtualHost <ip address>:8447>
ServerName <site FQDN>
SSLEngine On
SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2
SSLCipherSuite HIGH:!RC4:!MD5:!aNULL:!eNULL:!EXP:!LOW:!MEDIUM
SSLCACertificateFile /path/<site FQDN>.crt
SSLCertificateFile /path/<site FQDN>.crt
SSLCertificateKeyFile /path/<site FQDN>.key
WSGIScriptAlias / <WSGI script location>
WSGIDaemonProcess horizon user=<user> group=<group> processes=3 threads=10
<Directory <WSGI dir>>
Require all granted
</Directory>
</VirtualHost>

This section is similar to the previous one, with the only difference being the port number.

Restart httpd and everything will be encrypted.

In case you use nginx instead:

server {
listen 443 ssl;
ssl_certificate /path/<site FQDN>.crt;
ssl_certificate_key /path/<site FQDN>.key;
ssl_protocols TLSv1.1 TLSv1.2;
ssl_ciphers HIGH:!RC4:!MD5:!aNULL:!eNULL:!EXP:!LOW:!MEDIUM;
server_name <site FQDN>;
keepalive_timeout 5;
location / {
# WSGI/proxy configuration for the API goes here
}
}




Before understanding what REST can provide in architecture, this is a good time to discuss REST in more detail. Dr. Roy Fielding, the creator of the architectural approach
called REST, looked at how the Internet, a highly distributed network of independent resources, worked collectively with no knowledge of any resource located on any server. Fielding applied those same concepts to REST by declaring the following four major constraints.
1. Separation of resource from representation. Resources and representations must be loosely coupled. For example, a resource may be a data store or a chunk of code, while the representation might be an XML or JSON result set or an HTML page.
2. Manipulation of resources by representations. A representation of a resource with any metadata attached provides sufficient information to modify or delete the resource on the server, provided the client has permission to do so.
3. Self-descriptive messages. Each message provides enough information to describe how to process it. For example, an “Accept: application/xml” header tells the parser to expect XML as the format of the message.
4. Hypermedia as the engine of application state (HATEOAS). The client interacts with applications only through hypermedia (e.g., hyperlinks). The representations reflect the current state of the hypermedia applications.

Let’s look at these constraints one at a time. By separating the resource from its representation, we can scale the different components of a service independently. For example, if the resource is a photo, a video, or some other file, it may be distributed across a content delivery network (CDN), which replicates data across a high-performance distributed network for speed and reliability. The representation of that resource may be an XML message or an HTML page that tells the application what resource to retrieve. The HTML pages may be executed on a web server farm across many servers in multiple zones in Amazon’s public cloud—Amazon Web Services (AWS)—even though the resource (let’s say it is a video) is hosted by a third-party content delivery network (CDN) vendor like AT&T. This arrangement would not be possible if both the resource and the representation did not adhere to the constraint.

The next constraint, manipulation of resources by representations, basically says that resource data (let’s say it is a customer row in a MySQL table) can only be modified or deleted on the database server if the client sending the representation (let’s say it is an XML file) has enough information (PUT, POST, DELETE) and has permission to do so (meaning that the user specified in the XML message has the appropriate database permissions). Another way to say it: the representation should have everything it needs to request a change from a resource provider, assuming the requester has the appropriate credentials.

The third constraint simply says that the messages must contain information that describes how to parse the data. For example, Twitter has an extensive library of APIs that are free for the public to use. Since the end users are unknown entities to the architects at Twitter, they have to support many different ways for users to retrieve data. They support both XML and JSON as output formats for their services. Consumers of their services must describe in their requests which format their incoming messages are in so that Twitter knows which parser to use to read the incoming messages. Without this constraint, Twitter would have to write a new version of each service for every different format that its users might request. With this constraint in place, Twitter can simply add parsers as needed and can maintain a single version of its services.

The fourth and most important constraint is HATEOAS. This is how RESTful services work without maintaining application state on the server side. By leveraging hypermedia as the engine of application state, the application state is represented by a series of links (uniform resource identifiers, or URIs) on the client side, much like following the site map of a website by following the URLs. When a resource (i.e., server or connection) fails, the resource that resumes working on the services starts with the URI of the failed resource (the application state) and resumes processing.

A good analogy of HATEOAS is the way a GPS works in a car. Punch in a final destination on the GPS and the application returns a list of directions. You start driving by following these directions. The voice on the GPS tells you to turn when the next instruction is due. Let’s say you pull over for lunch and shut off the car. When you resume driving, the remaining directions in the trip list pick right up where they left off. This is exactly how REST works via hypermedia. A node failing is similar to shutting your car off for lunch and another node picking up where the failed node left off is similar to restarting the car and the GPS. Make sense?
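The GPS analogy can be sketched as a toy hypermedia client (illustrative Python; the URIs and payloads are made up). Each representation carries the link to the next step, so the current URI alone is the application state, and any client can resume from it.

```python
# Representations carry the links that drive the next client action, so no
# application state lives on the server. If a client fails mid-flow, another
# can resume from the last URI alone, like restarting the car and the GPS.
PAGES = {
    "/orders/42": {"status": "placed",
                   "links": {"next": "/orders/42/payment"}},
    "/orders/42/payment": {"status": "paid",
                           "links": {"next": "/orders/42/shipment"}},
    "/orders/42/shipment": {"status": "shipped", "links": {}},
}

def follow(start_uri):
    """Walk the hypermedia links; the URI alone is the application state."""
    visited, uri = [], start_uri
    while uri:
        page = PAGES[uri]
        visited.append(page["status"])
        uri = page["links"].get("next")
    return visited

assert follow("/orders/42") == ["placed", "paid", "shipped"]
# A recovering client resumes mid-flow from just the saved URI:
assert follow("/orders/42/payment") == ["paid", "shipped"]
```

Nothing about the client’s progress is stored server-side; handing the URI to a different node is all the failover that is needed.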
Why are the four constraints of REST so important when building solutions in the cloud? The cloud, like the Internet, is a massive network of independent resources that are designed to be fault-tolerant. By following the constraints of REST, the software components that run in the cloud have no dependencies on the underlying infrastructure that may fail at any time. If these four constraints are not followed, it creates limitations on the application’s ability to scale and to fail over to the next available resource.

As with any architectural constraint, there are trade-offs. The more abstraction that is built into an architecture, the more flexible and agile the architecture will be, but it comes with a price. Building RESTful services correctly takes more up-front time because building loosely coupled services is a much more involved design process. Another trade-off is performance. Abstraction creates overhead, which can impact performance. There may be some use cases where the performance requirements far exceed the benefits of REST and, for that particular use case, another method may be required. There are other design issues to be aware of that are covered in the next section.

The Challenges of Migrating Legacy Systems to the Cloud

One of the challenges companies have when they decide to port applications from on-premises to the cloud is that many of their legacy systems are reliant on ACID transactions. ACID (atomicity, consistency, isolation, durability) transactions are used to ensure that a transaction is complete and consistent. With ACID transactions, a transaction is not complete until it is committed and the data is up to date. In an on-premises environment where data may be tied to a single partition, forcing consistency is perfectly acceptable and often the preferred method.

In the cloud, there is quite a different story. Cloud architectures rely on Basically Available, Soft State, Eventually Consistent (BASE) transactions. BASE transactions acknowledge that resources can fail and the data will eventually become consistent. BASE is often used in volatile environments where nodes may fail or systems need to work whether the user is connected to a network or not. This is extremely important as we move into the world of mobile,
where connectivity is spotty at times. Getting back to the legacy system discussion, legacy systems often rely on ACID transactions, which are designed to run in a single partition and expect the data to be consistent. Cloud-based architectures require partition tolerance, meaning if one instance of a compute resource cannot complete the task, another instance is called on to finish the job. Eventually the discrepancies will be reconciled and life will go on its merry way. However, if a legacy system with ACID transactionality is ported and not modified to deal with partition tolerance, users of the system will not get the data consistency they are accustomed to and they will challenge the quality of the system. Architects will have to account for reconciling inconsistencies, which is nothing new. In retail they call that balancing the till, which is an old way of saying making sure the cash in the drawer matches the receipt tape at the end of the day. But many legacy applications were not designed to deal with eventual consistency and will frustrate the end users if they are simply ported to the cloud without addressing this issue. What about those mega-vendors out there whose legacy applications are now cloud-aware applications? Most of those rebranded dinosaurs are actually running in a single partition and don’t really provide the characteristics of cloud-based systems such as rapid elasticity and resource pooling. Instead, many of them are simply large, monolithic legacy systems running on a virtual machine at a hosted facility, a far cry from being a true cloud application. It is critical that architects dig under the covers of these vendor solutions and make sure that they are not being sold snake oil.
There is a new breed of vendors that offer cloud migration services. It is important to note that these solutions simply port the legacy architecture as is. That means if the legacy application can only run in a single tenant, it will not be able to take advantage of the elasticity that the cloud offers. For some applications, there may be no real benefit in porting them to the cloud.
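The BASE reconciliation described above can be sketched as a toy two-replica store (illustrative only; last-writer-wins on a timestamp, where real systems use vector clocks or similar). Two nodes accept writes independently, disagree for a while, and a reconcile pass brings them back into agreement, the cloud-era version of “balancing the till.”

```python
# Toy BASE store: each replica accepts writes independently and is brought
# back into agreement later by a reconcile pass (last-writer-wins).
replica_a = {}
replica_b = {}

def write(replica, key, value, ts):
    """Apply a timestamped write; older writes never overwrite newer ones."""
    current = replica.get(key)
    if current is None or ts > current[1]:
        replica[key] = (value, ts)

def reconcile(a, b):
    """Push every entry both ways; the newer timestamp wins on each side."""
    for key, (value, ts) in list(a.items()):
        write(b, key, value, ts)
    for key, (value, ts) in list(b.items()):
        write(a, key, value, ts)

write(replica_a, "balance", 100, ts=1)   # accepted at node A
write(replica_b, "balance", 80, ts=2)    # a later write lands at node B
assert replica_a["balance"] != replica_b["balance"]  # temporarily inconsistent
reconcile(replica_a, replica_b)
assert replica_a["balance"] == replica_b["balance"] == (80, 2)
```

The window between the two asserts is exactly the “soft state” a ported ACID application is not designed to tolerate.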

Architecting solutions for cloud computing requires a solid understanding of how the cloud works. To build resilient solutions that scale, one must design a solution with the expectation that everything can and will fail. Cloud infrastructure is designed for high availability and is partition tolerant in nature. Migrating single-partition applications to the cloud makes the migration act more like a hosting solution rather than a scalable cloud solution. Building stateless, loosely coupled, RESTful services is the secret to thriving in this highly available, eventually consistent world. Architects must embrace this method of building software to take advantage of the elasticity that the cloud provides.

Bloomberg, J. (2013). The Agile Architecture Revolution: How Cloud Computing, REST-Based SOA, and Mobile Computing Are Changing Enterprise IT. Hoboken, NJ: John Wiley & Sons.
Bloomberg, J. (2011, June 1). “BASE Jumping in the Cloud: Rethink Data Consistency.”
Fielding, R. (2000). “Representational State Transfer (REST),” in “Architectural Styles and the Design of Network-based Software Architectures.” Ph.D. Dissertation, University of California, Irvine.
Hoff, T. (2013, May 1). “Myth: Eric Brewer on Why Banks Are BASE Not ACID—Availability Is Revenue.”
Architecting the Cloud: Design Decisions for Cloud Computing Service Models (SaaS, PaaS, and IaaS)



The question is: should you use one firewall for both the perimeter and the internal network? Some would argue that if an attacker can hack the first firewall, he/she can definitely attack the internal firewall, so why spend on two firewalls? Another argument single-firewall proponents bring to the table is that if there is any configuration mismatch between the internal and perimeter firewalls, there is no use for two; for example, if by human error anything is bypassed, the doors are wide open for attackers. In addition, you have to protect your environment from worms and malware, which come in the form of bots and keep looking for loopholes to exploit. So don’t think someone in China or the US is waiting for you to make a mistake 🙂

If you read the CISSP material, one of the very basic design principles is defence in depth. This means you have layered security, which makes the job hectic for attackers, viz. IPS at the firewall, anti-virus at the firewall, stateful inspection at the firewall, anomaly detection at the firewall, then security modules on routers, an internal firewall, the WAF, then the servers’ own firewalls, and, if possible, host-level anti-virus and IPS. See how many checkpoints there are? Many, right?

Likewise, government agencies suggest using physical appliances wherever possible, like a physical IPS, two firewalls, etc. Nowadays, with virtualization and everyone looking to save cost, regulatory bodies are making such requirements mandatory, because a converged setup on one physical hardware platform imposes risk if that hardware itself has a bug and is compromised. The risk and business impact of any such incident would be huge, including to the brand image. Thus, one should be careful about saving cost, but not at the cost of the brand image and lifeline of the company. Attackers are learning, and no software is foolproof. Let’s throw some light on the defence-in-depth strategy.

A good defense-in-depth strategy involves many different technologies, such as intrusion detection, content filtering, and transport layer security. The single most important element, however, is a system of internal firewalls. Proper deployment of these devices addresses the following security concerns:

§ Employees will not have unrestricted access to the entire network, and their activity can be monitored.

§ Partners, customers, and suppliers can be given limited access to whatever resources they require, while maintaining isolation of critical servers.

§ Critical servers can be closely monitored when they are isolated behind an internal firewall. Any malicious activity would be much easier to detect, since the firewall has a limited amount of traffic passing through it.

§ Remote users can be restricted to certain portions of the network, and VPN traffic can be contained and easily monitored.

§ A security breach in one segment of the network will be limited to local machines, instead of compromising the security of the entire network.

With a system of internal firewalls in place, we can come much closer to our ideal network. Instead of an all-or-nothing security posture, we can achieve defense-in-depth by forcing an attacker to penetrate multiple layers of security to reach mission-critical servers.


In our engagements with medium to high-end customers, including finance and insurance, we have seen a dual-firewall policy for the DMZ and internal LAN, and that is the approach we recommend. It is also the most secure approach according to Stuart Jacobs: use two firewalls to create a DMZ. The first firewall (also called the “front-end” or “perimeter” firewall) must be configured to allow traffic destined for the DMZ only. The second firewall (also called the “back-end” or “internal” firewall) only allows traffic from the DMZ to the internal network.

This setup is considered more secure, since two devices would need to be compromised. There is even more protection if the two firewalls come from two different vendors, because it is then less likely that both devices suffer from the same security vulnerabilities. The threat that a bug in one firewall poses to the whole infrastructure is minimized by having firewalls from two different vendors: accidental misconfiguration is less likely to occur the same way across two vendors’ configuration interfaces, and a security hole found in one vendor’s system is less likely to exist in the other.
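The dual-firewall rule can be sketched as a toy packet filter (illustrative Python; the zone names and the DMZ relay are made up). Traffic must satisfy both rule sets to reach the LAN: the perimeter firewall only admits traffic destined for the DMZ, and the internal firewall only admits traffic originating in the DMZ.

```python
def perimeter_allows(packet):
    """Front-end firewall: only traffic destined for the DMZ gets in."""
    return packet["dst_zone"] == "dmz"

def internal_allows(packet):
    """Back-end firewall: only DMZ-originated traffic reaches the LAN."""
    return packet["src_zone"] == "dmz" and packet["dst_zone"] == "lan"

def reaches_lan(packet, relay_via_dmz):
    if not perimeter_allows(packet):
        return False
    # e.g. a reverse proxy in the DMZ re-originates the request toward the LAN
    return internal_allows(relay_via_dmz(packet))

def proxy(packet):
    """Toy DMZ host that forwards an admitted request to the internal LAN."""
    return {"src_zone": "dmz", "dst_zone": "lan", "port": packet["port"]}

# Direct internet -> LAN traffic is dropped at the perimeter:
assert reaches_lan({"src_zone": "internet", "dst_zone": "lan",
                    "port": 443}, proxy) is False
# Internet -> DMZ traffic is admitted, then re-screened before the LAN:
assert reaches_lan({"src_zone": "internet", "dst_zone": "dmz",
                    "port": 443}, proxy) is True
```

An attacker who compromises the perimeter device still faces a second, independently configured rule set, which is the whole point of the layered design.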

Azure Machine Learning Benefits & Pitfalls

Today, allow me to write something on machine learning and give a verdict on Azure Machine Learning and its pitfalls. Let’s start with what machine learning is: the construction and study of algorithms that can learn from data. There are two approaches to machine learning: supervised learning and unsupervised learning. Decisions in ML are made through regression, classification, and clustering to solve problems.

Some examples for supervised and unsupervised are given below for you to understand.

In unsupervised learning, data points have no labels associated with them. Instead, the goal of an unsupervised learning algorithm is to organize the data in some way or to describe its structure. This can mean grouping it into clusters or finding different ways of looking at complex data so that it appears simpler or more organized.

  1. Identifying patterns in data – unsupervised learning
  2. Studying the past – unsupervised learning
  3. Learning from historical data to predict/recommend – supervised learning

A supervised learning algorithm looks for patterns in those value labels. It can use any information that might be relevant—the day of the week, the season, the company’s financial data, the type of industry, the presence of disruptive geopolitical events—and each algorithm looks for different types of patterns. After the algorithm has found the best pattern it can, it uses that pattern to make predictions for unlabeled testing data—tomorrow’s prices.

Supervised learning is a popular and useful type of machine learning. With one exception, all the modules in Azure Machine Learning are supervised learning algorithms. There are several specific types of supervised learning that are represented within Azure Machine Learning: classification, regression, and anomaly detection.
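As a toy stand-in for a classification module (pure Python, not Azure ML’s own API), the sketch below fits per-class centroids from labeled training points and then labels new points by the nearest centroid, which is the supervised pattern-then-predict loop described above in its simplest form.

```python
# Minimal supervised learner: fit per-class centroids from labeled 2-D
# points, then label new points by the nearest centroid.
def fit(points, labels):
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    # centroid = mean of the training points for each class
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def predict(centroids, point):
    px, py = point
    return min(centroids,
               key=lambda lbl: (centroids[lbl][0] - px) ** 2
                             + (centroids[lbl][1] - py) ** 2)

train_x = [(1, 1), (2, 1), (8, 9), (9, 8)]
train_y = ["low", "low", "high", "high"]
model = fit(train_x, train_y)
assert predict(model, (1.5, 1.2)) == "low"
assert predict(model, (8.5, 8.5)) == "high"
```

The labels in the training data are what make this supervised; the clustering case from the unsupervised examples above would have to discover the two groups without them.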

The steps are below –

  1. Planning data storage, setting up the environment, and preprocessing data happen outside the ML system
  2. Setting up the environment includes preparing the storage environment, the preprocessing environment, and the ML workspace
  3. HDInsight can be used for preprocessing the data

Microsoft Azure Machine Learning, a fully-managed cloud service for building predictive analytics solutions, helps overcome the challenges most businesses have in deploying and using machine learning.

Now comes the pros and cons –

Benefits –

  1. No data limit when pulling data from Azure storage and HDFS systems
  2. Azure ML is a much friendlier set of tools, and it’s less restrictive on the quality of the training data
  3. Azure ML’s tools make it easy to import training data and then tune the results
  4. One-click publishing lets the data model be published as a web service
  5. Cost of maintenance is low compared to on-premises analytics solutions
  6. Drag, drop, and connect structures are available to build an experiment
  7. Built-in R module, support for Python, and options for custom R code for extensibility
  8. Security for the Azure ML service relies on Azure security measures

Pitfalls –

× 10 GB data limit for flat-file processing

× Predictive Model Markup Language (PMML) is not supported; however, custom R and Python code can be used to define a module

× There is no version control or Git integration for experiment graphs

× Only smaller amounts of data can be read from systems like Amazon S3

Verdict – If you wish to run deep learning and need resources only at times, not always, the cloud is a fantastic option.

Cheers 🙂