Oozie Interview Questions

1) Types of Oozie Jobs
- Coordinator (periodic) job: A recurring job that is triggered at a set time or frequency, or when its input data becomes available.
- A coordinator job can manage multiple workflow jobs, including cases where the output of one workflow is the input of another.
- This chained behavior is known as a “data application pipeline”.
- Oozie workflow: A Directed Acyclic Graph (DAG) made up of a collection of control nodes and action nodes.
- Control nodes define the execution path: where the workflow starts and ends, which branch a decision takes, and where parallel paths fork and join.
- Action nodes trigger the actual execution of a task, such as a MapReduce, Pig, or Hive job.
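- Oozie bundle: A collection of coordinator jobs that can be started, suspended, resumed, and stopped together as a single unit.
- The coordinator jobs in a bundle are usually related. Oozie defines no explicit dependency between them, but they can depend on each other implicitly through the data they produce and consume.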
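
To make the DAG idea concrete, here is a minimal sketch in plain Java. It is not Oozie's API: the class `WorkflowDagSketch`, its methods, and the node names are all hypothetical and only model a workflow as a graph of transitions between control nodes (start, fork, join, end) and action nodes. It computes one valid execution order and rejects graphs that contain a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a workflow DAG; this is NOT part of the Oozie API.
class WorkflowDagSketch {
    // Adjacency list: node name -> names of the nodes it transitions to.
    private final Map<String, List<String>> transitions = new LinkedHashMap<>();

    void addTransition(String from, String to) {
        transitions.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        transitions.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Kahn's algorithm: returns one valid execution order,
    // or throws if the graph contains a cycle (and so is not a DAG).
    List<String> executionOrder() {
        // Count the incoming transitions of every node.
        Map<String, Integer> pending = new LinkedHashMap<>();
        for (String node : transitions.keySet()) {
            pending.put(node, 0);
        }
        for (List<String> targets : transitions.values()) {
            for (String target : targets) {
                pending.merge(target, 1, Integer::sum);
            }
        }
        // Nodes with no unfinished predecessors are ready to run.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String target : transitions.get(node)) {
                if (pending.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (order.size() != transitions.size()) {
            throw new IllegalStateException("cycle detected: not a valid workflow DAG");
        }
        return order;
    }

    // start -> fork -> (action-a, action-b) -> join -> end
    static WorkflowDagSketch example() {
        WorkflowDagSketch wf = new WorkflowDagSketch();
        wf.addTransition("start", "fork");
        wf.addTransition("fork", "action-a");
        wf.addTransition("fork", "action-b");
        wf.addTransition("action-a", "join");
        wf.addTransition("action-b", "join");
        wf.addTransition("join", "end");
        return wf;
    }

    public static void main(String[] args) {
        System.out.println(example().executionOrder());
    }
}
```

In this model the join node becomes ready only after both forked actions have finished, which mirrors how a fork and join pair brackets parallel paths in a workflow.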

2) Oozie Architecture
- Oozie's architecture consists of a web server and a database that stores the state of all jobs.
- In Oozie 4.x and earlier, the default web server is Apache Tomcat, an open source implementation of the Java Servlet technology; Oozie 5 replaced it with an embedded Jetty server.
- The Oozie server is a stateless web application: it does not keep any user or job information in memory.
- All of this information is stored in the SQL database, and Oozie retrieves the job state from the database each time it processes a request.
- Users and Oozie clients can interact with the server through the command line tool, the Java client API, or the HTTP REST API.
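
As an illustration of the HTTP REST route, the following sketch builds a job status request with the JDK's own `java.net.http` client (Java 11+). The class name `OozieRestSketch`, the server URL `http://localhost:11000/oozie`, and the job ID in `main` are assumptions made for the example; the `/v1/job/<id>?show=info` path follows the Oozie Web Services API, but check it against the documentation for your Oozie version.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper; only JDK classes are used, not the Oozie client library.
class OozieRestSketch {
    // Builds GET <oozieUrl>/v1/job/<jobId>?show=info without sending it.
    static HttpRequest jobInfoRequest(String oozieUrl, String jobId) {
        URI uri = URI.create(oozieUrl + "/v1/job/" + jobId + "?show=info");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed server URL and a made-up job ID; replace both with real values.
        HttpRequest request = jobInfoRequest(
                "http://localhost:11000/oozie",
                "0000001-200101000000000-oozie-oozi-W");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Because the server is stateless, this answer is read from the database.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running `main` needs a reachable Oozie server; `jobInfoRequest` on its own only constructs the request, so it can be inspected or tested offline.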

