Syllabus
UNIT-II
Cloud Enabling Technologies, Ubiquitous Internet, parallel and distributed computing, elements of parallel computing, hardware architectures for parallel computing (SISD, SIMD, MISD, MIMD), elements of distributed computing, Inter-process communication, technologies for distributed computing, remote procedure calls (RPC), service-oriented architecture (SOA), Web services, virtualization.
Topic-wise textbooks/references
2.1 Cloud Enabling Technologies
2.2 Ubiquitous Internet
2.3 Parallel and Distributed Computing
2.4 Elements of Parallel Computing
• Hardware architectures for parallel computing (SISD, SIMD, MISD, MIMD)
2.5 Elements of Distributed Computing
• Inter-process communication
2.6 Technologies for Distributed Computing
• Remote procedure calls (RPC)
2.7 Service-Oriented Architecture (SOA)
• Web services
2.8 Virtualization
Amazon Web Services (AWS): This is a comprehensive, broadly adopted, and leading cloud platform offered by Amazon. It provides a vast array of services, including computing power (e.g., EC2 for virtual machines), storage (e.g., S3 for object storage), databases, analytics, machine learning, networking, mobile, developer tools, and more. AWS is a prime example of a public cloud offering Infrastructure as a Service (IaaS) and many other service models.
Google App Engine: This is a Platform as a Service (PaaS) offered by Google Cloud. It allows developers to build and host web applications on Google's scalable infrastructure without managing the underlying servers. It supports various programming languages and automatically scales applications based on demand.
Microsoft Azure: This is Microsoft's cloud computing platform, similar to AWS and Google Cloud. Azure offers a wide range of services, including IaaS (virtual machines, networking), PaaS (app services, databases), and Software as a Service (SaaS) offerings. It caters to a broad spectrum of computing needs, from simple web hosting to complex enterprise solutions.
Hadoop: This is an open-source framework for distributed storage and processing of large datasets across clusters of computers. It's not a cloud service provider itself, but rather a foundational technology for big data analytics. Key components include HDFS (Hadoop Distributed File System) for storage and MapReduce for processing. Many cloud providers offer managed Hadoop services.
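To make the map/shuffle/reduce flow concrete, here is a minimal word-count sketch in plain Python. It only illustrates the idea; real Hadoop jobs run across a cluster through the Hadoop APIs or a managed cloud service, and the function names here are invented for illustration.

```python
# A minimal sketch of the MapReduce idea (word count) in plain Python.
# Real Hadoop distributes these phases across cluster nodes; here they
# run in one process purely to show the data flow.
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Shuffle + Reduce: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)

docs = ["the cloud scales", "the cloud is elastic"]
print(reduce_phase(map_phase(docs)))  # {'the': 2, 'cloud': 2, ...}
```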
Force.com and Salesforce.com: These are closely related.
• Salesforce.com is primarily known as a leading provider of Software as a Service (SaaS), especially for Customer Relationship Management (CRM). It offers various cloud-based applications for sales, service, marketing, and more.
• Force.com is the underlying Platform as a Service (PaaS) on which Salesforce.com applications are built. It allows developers to build custom applications that integrate with Salesforce CRM and leverage its robust infrastructure, database, and user interface components.
Essentially, Salesforce.com is the product, and Force.com is the platform on which products can be built.
Manjrasoft Aneka: This is a PaaS (Platform as a Service) software framework that allows developers to build and deploy distributed applications on private or hybrid clouds. It focuses on providing a flexible environment for parallel and distributed programming, supporting various programming models (like Bag of Tasks, MapReduce). Aneka aims to simplify the development and deployment of applications that can leverage the power of cloud computing environments.
Ubiquitous = present everywhere
“Ubiquitous Internet” means that internet connectivity is available anytime, anywhere, and on any device.
So, users, devices, and applications can seamlessly connect to services without location constraints.
Ubiquitous Internet = The backbone that connects users and devices to the cloud anytime, anywhere.
Without ubiquitous Internet, cloud computing loses its core value proposition of on-demand, location-independent, device-agnostic services.
Example Use Case: A company uses a SaaS CRM (like Salesforce).
Sales teams in different countries can log in anytime.
Managers can get real-time dashboards on their phones.
Data syncs seamlessly because of ubiquitous connectivity.
The Ubiquitous Internet is a fundamental enabler of cloud computing because it:
Connects users to cloud services: Users can access data, apps, and infrastructure hosted in the cloud from anywhere.
Example: Accessing Google Drive, Office 365, or AWS from a laptop, tablet, or phone.
Facilitates on-demand resources: Cloud computing relies on delivering resources (compute, storage, applications) over the Internet. The always-available Internet makes this possible for globally distributed users.
Supports multi-device access: One user can switch between devices (laptop → tablet → smartphone) and still access the same cloud services.
Enables edge and IoT integration: Smart devices and IoT sensors use ubiquitous connectivity to send data to the cloud for processing and analytics.
Benefits of Ubiquitous Internet in Cloud Computing:
Accessibility: 24/7 availability of services and data.
Scalability: Cloud providers can serve millions of users globally.
Flexibility: Work and collaborate from any location.
Cost-effectiveness: Organizations don’t need on-premises infrastructure for remote workers.
Innovation: Enables new apps like smart cities, connected vehicles, and real-time collaboration tools.
Computing has been dominated by two fundamental models: sequential and parallel.
Sequential computing began in the 1940s, and parallel (and distributed) computing emerged within the following decade.
Progress in computing eras started with hardware architectures, which enabled system software such as compilers and operating systems, and then applications.
Applications and systems became central as problem-solving environments appeared, allowing engineers to work effectively and marking the maturity and mainstream adoption of a paradigm.
Each era’s aspects passed through three phases: research and development, commercialization, and commoditization.
| Aspect | Parallel Computing | Distributed Computing |
|---|---|---|
| Coupling | Tightly coupled | Often loosely coupled |
| Memory | Shared memory | May have no shared memory |
| Processors | Homogeneous processors | Often heterogeneous nodes |
| Location | Usually one physical system | Often geographically dispersed |
| Communication | Via shared memory | Via network messages |
| Typical systems | Multi-core processors, supercomputers | Cloud services, grids, Internet computing |
| Examples | GPUs, multi-core VMs | Apache Hadoop, Apache Spark, Kubernetes |
2.4.1 Parallel processing
Processing of multiple tasks simultaneously on multiple processors is called parallel processing.
A given task is divided into multiple subtasks using a divide-and-conquer technique, and each subtask is processed on a different CPU.
Programming a multiprocessor system using the divide-and-conquer technique is called parallel programming (see the sketch below).
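A minimal divide-and-conquer sketch using Python's standard multiprocessing module; the data size and worker count are illustrative, not prescriptive.

```python
# Divide-and-conquer parallel sum: split the data into chunks, sum each
# chunk in a separate worker process, then combine the partial results.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each subtask sums one slice of the data in its own process.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    step = len(data) // n_workers
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with Pool(n_workers) as pool:
        # Conquer: combine the partial results of the subtasks.
        print(sum(pool.map(partial_sum, chunks)))
```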
Various factors influencing parallel computing:
Increasing computing requirements in fields such as scientific research, business, aerospace, GIS, and mechanical design and analysis.
Sequential architectures are reaching physical limitations.
Hardware improvements such as pipelining and superscalar execution are not scalable and require sophisticated compiler technology.
Vector processing works well for certain scientific problems but not for other areas, such as databases.
The technology of parallel processing has matured and can be exploited commercially.
2.4.2 Hardware architectures for parallel processing:
Single Instruction Single Data (SISD)
Single Instruction Multiple Data (SIMD)
Multiple Instruction Single Data (MISD)
Multiple Instruction Multiple Data (MIMD)
SISD:
SISD uses a single processor executing one instruction stream on one data stream, making it a sequential computing model.
Instructions in SISD are executed one after another, so such systems are commonly known as sequential computers.
Most conventional systems like desktops and workstations follow the SISD model architecture.
Both program instructions and data must reside in primary memory for processing in SISD systems.
Overall processing speed is mainly constrained by the internal data transfer rate within the computer.
SIMD:
SIMD uses multiple processors executing the same instruction on different data streams in parallel.
It is especially suitable for scientific workloads rich in vector and matrix operations.
Vector statements like C[i] = A[i] * B[i] can be broadcast to all processing elements simultaneously.
Data from vectors A and B is partitioned into N sets so each of the N processing elements handles one set.
Classic SIMD examples include CRAY vector processors and Thinking Machines’ CM series.
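A minimal sketch of the SIMD idea using NumPy (assumed installed): the single statement C = A * B is applied element-wise to whole vectors at once, and NumPy dispatches it to vectorized, SIMD-style machine code. The sequential loop is included for contrast with the SISD model above.

```python
# SISD-style loop vs. SIMD-style whole-vector statement.
import numpy as np

A = np.arange(1, 9, dtype=np.float64)    # A[1]..A[8]
B = np.arange(8, 0, -1, dtype=np.float64)

# Sequential (SISD-style) equivalent, one element at a time:
C_seq = np.empty_like(A)
for i in range(len(A)):
    C_seq[i] = A[i] * B[i]

# SIMD-style: one instruction applied to multiple data elements:
C_simd = A * B
assert np.array_equal(C_seq, C_simd)
print(C_simd)
```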
MISD:
MISD is a multiprocessor system where different instructions run on multiple processing elements, all using the same data set.
Example operations like y = sin(x) + cos(x) + tan(x) apply different computations to the same input.
Such MISD architectures are rarely useful for practical applications, so only a few experimental machines were ever built.
No major commercial systems follow the MISD model, which remains largely a theoretical or intellectual architecture.
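Since no mainstream hardware actually implements MISD, the following Python sketch is only an analogy: separate threads stand in for processing elements, each applying a different operation to the same datum x.

```python
# Conceptual MISD analogy: different "instruction streams" (functions)
# applied concurrently to the same data item.
import math
from concurrent.futures import ThreadPoolExecutor

x = 0.5
instructions = [math.sin, math.cos, math.tan]  # different ops, same data

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda f: f(x), instructions))

print(sum(results))  # y = sin(x) + cos(x) + tan(x)
```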
MIMD:
MIMD systems use multiple processors, each with its own instruction and data stream, executing asynchronously and handling diverse applications.
Shared-memory MIMD connects all processors to a single global memory, enabling communication via shared variables but causing contention and lower scalability.
Distributed-memory MIMD gives each processor its own local memory, with communication via message-passing over an interconnection network such as tree or mesh.
Shared-memory MIMD is easier to program but less fault-tolerant and scalable than distributed-memory MIMD, making distributed designs more popular in modern systems.
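A minimal sketch of the distributed-memory, message-passing style, using Python's multiprocessing Pipe as a stand-in for the interconnection network; the two programs and their data are invented for illustration.

```python
# Distributed-memory MIMD sketch: two processes run different programs
# on different data and coordinate only by passing messages.
from multiprocessing import Process, Pipe

def producer(conn):
    conn.send([x * x for x in range(5)])  # its own program and data
    conn.close()

def consumer(conn):
    print("received:", conn.recv())       # a different program

if __name__ == "__main__":
    parent, child = Pipe()
    p1 = Process(target=producer, args=(child,))
    p2 = Process(target=consumer, args=(parent,))
    p1.start(); p2.start()
    p1.join(); p2.join()
```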
2.5.1 Concepts
Def1: A distributed system is a collection of independent computers that appears to its users as a single coherent system.
Def2: A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages.
2.5.2 Components
Layered stack: Distributed systems span from hardware and networking up to operating systems, middleware, and applications, appearing as a single coherent system.
Hardware & OS: Networked computers and parallel hardware form the physical base, managed by the OS for IPC, process scheduling, and resource management.
Standards & protocols: Hardware, network, and OS standards (e.g., TCP/IP, UDP) let heterogeneous components interoperate as one uniform system.
Middleware role: Middleware builds on OS services to offer its own protocols, data formats, and programming frameworks, hiding lower-level heterogeneity.
Uniform programming model: Developers program against middleware APIs, gaining a consistent abstraction independent of specific machines or operating systems.
Applications & services: Applications sit on top, using middleware to deliver services (often via GUIs or web interfaces) to end users.
Cloud example: In clouds, hardware/OS form IaaS, middleware provides virtual hardware and development platforms (PaaS), and applications are delivered as SaaS to support use cases like social networks and scientific computing.
2.5.3 Architectural Styles:
Role of middleware: Architectural style mainly concerns middleware, which structures components and interactions to provide a coherent runtime for distributed applications.
Purpose of styles: Architectural styles define typical components, connectors, and constraints on their composition, helping classify and reason about distributed software structures.
Components vs connectors: Components encapsulate functionality (programs, objects, processes), while connectors are communication mechanisms (IPC, events, protocols) coordinating them.
Software styles: Software architectural styles describe logical organization (e.g., data-centered, data-flow, virtual machine, call-and-return, independent components).
System styles: System architectural styles describe physical deployment, notably client–server and peer-to-peer arrangements over the network.
Client–server model: Client–server centralizes services on servers accessed by clients, supports multi-tier (two-tier, three-tier, N-tier) decompositions, and is dominant but has scalability limits (see the socket sketch after this list).
Peer-to-peer model: Peer-to-peer gives each node both client and server roles, better supporting large, decentralized systems but complicating algorithm design and management.
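A minimal two-tier client–server sketch using Python's standard socket module; the address, port, and the upper-casing "service" are invented for illustration. In a real deployment the server would run on a separate machine; here a daemon thread keeps the example self-contained.

```python
# Two-tier client-server sketch: the server centralizes a service that
# clients reach over the network (a local socket stands in for it here).
import socket, threading, time

HOST, PORT = "127.0.0.1", 5050  # illustrative local address

def server():
    with socket.create_server((HOST, PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024).upper())  # the shared "service"

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)  # crude wait for the server to start listening

with socket.create_connection((HOST, PORT)) as cli:  # the client role
    cli.sendall(b"hello cloud")
    print(cli.recv(1024))  # b'HELLO CLOUD'
```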
1. Define Parallel Computing.
Ans: Parallel computing is the simultaneous use of multiple compute resources (like multiple processors) to solve a computational problem by breaking it into smaller discrete parts that can be solved concurrently.
2. Define Distributed Computing.
Ans: Distributed computing is a field where components of a software system are shared among multiple computers to improve efficiency and performance, communicating via a network to achieve a common goal.
3. What is SISD in Flynn’s Taxonomy?
Ans: Single Instruction, Single Data (SISD) is a computer architecture in which a single processor executes a single instruction stream on data stored in a single memory (e.g., a traditional desktop PC).
4. What is MIMD?
Ans: Multiple Instruction, Multiple Data (MIMD) is a technique where multiple autonomous processors simultaneously execute different instructions on different data.
5. What is SIMD?
Ans: Single Instruction, Multiple Data (SIMD) describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously (common in Graphics Processing Units).
6. What is RPC (Remote Procedure Call)?
Ans: RPC is a protocol that one program can use to request a service from a program located in another computer on a network without having to understand the network's details.
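A minimal sketch using Python's standard xmlrpc modules; the add() procedure, host, and port are invented for illustration. The point is that the client invokes the procedure as if it were local, while the library handles the network details.

```python
# RPC sketch: the client-side proxy makes a remote function look local.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):          # the remote procedure
    return a + b

server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = ServerProxy("http://127.0.0.1:8000")
print(proxy.add(2, 3))  # looks like a local call; actually travels over HTTP
server.shutdown()
```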
7. Define Service-Oriented Architecture (SOA).
Ans: SOA is an architectural style where software components are designed as reusable services that communicate with each other over a network using a standard protocol (usually Web Services).
8. What are Web Services?
Ans: A Web Service is a software system designed to support interoperable machine-to-machine interaction over a network, typically using HTTP, XML, or JSON.
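A minimal sketch of a JSON-over-HTTP web service using only the Python standard library; the /status endpoint, payload, and port are invented for illustration of machine-to-machine interaction.

```python
# Tiny JSON web service plus a programmatic client, in one process.
import json, threading, urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"service": "demo", "status": "up"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence per-request logging
        pass

srv = HTTPServer(("127.0.0.1", 8080), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()

with urllib.request.urlopen("http://127.0.0.1:8080/status") as resp:
    print(json.load(resp))  # machine-to-machine interaction over HTTP
srv.shutdown()
```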
9. What is Inter-Process Communication (IPC)?
Ans: IPC refers to the mechanisms an operating system provides to allow processes to manage shared data; more broadly, it allows processes to communicate and synchronize their actions (see the sketch below).
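A minimal sketch of IPC through OS-provided shared memory, using Python's multiprocessing module; the counter and worker count are illustrative. It shows both aspects of the definition: shared data and synchronization.

```python
# IPC sketch: two processes share a counter via shared memory and
# synchronize their increments with the counter's built-in lock.
from multiprocessing import Process, Value

def worker(counter):
    for _ in range(10_000):
        with counter.get_lock():   # synchronize access to shared data
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)        # an int in shared memory
    procs = [Process(target=worker, args=(counter,)) for _ in range(2)]
    for p in procs: p.start()
    for p in procs: p.join()
    print(counter.value)           # 20000, thanks to the lock
```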
10. What is "Ubiquitous Internet"?
Ans: It refers to the concept where internet connectivity is available everywhere and anywhere, enabling devices to connect to the cloud from any location, which is a fundamental requirement for Cloud Computing.