Oozie Interview Questions
1) Types of Oozie Jobs
- Periodical/Coordinator Job: A recurrent job that runs at scheduled times or is triggered when its input data becomes available.
- A coordinator job can manage multiple workflow jobs, including cases where the output of one workflow is the input of another.
- This chained behavior is known as a “data application pipeline”.
- Oozie Hadoop Workflow: A Directed Acyclic Graph (DAG) made up of a collection of actions.
- Control nodes define the order of execution and the rules for starting and ending the workflow, and they steer the execution path through decision, fork, and join nodes.
- Action nodes, in contrast, trigger the execution of the actual tasks.
- Oozie Bundle: A collection of coordinator jobs that can be started, suspended, and stopped together as a single unit.
- The jobs in a bundle are usually related to one another, typically through data dependencies.
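To make the workflow description above concrete, here is a minimal Python sketch of a workflow modeled as a DAG. It is an illustration only, not Oozie's own code or file format: the node names (`ingest-logs`, `ingest-users`, `aggregate`) and the `execution_order` helper are hypothetical, and the check relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each node maps to the nodes it transitions to.
# "start", "fork", "join" and "end" play the role of control nodes;
# the remaining names stand in for action nodes.
WORKFLOW = {
    "start": ["fork"],
    "fork": ["ingest-logs", "ingest-users"],
    "ingest-logs": ["join"],
    "ingest-users": ["join"],
    "join": ["aggregate"],
    "aggregate": ["end"],
    "end": [],
}


def execution_order(transitions):
    """Return one valid execution order, or raise ValueError on a cycle."""
    sorter = TopologicalSorter()
    for node, successors in transitions.items():
        sorter.add(node)
        for nxt in successors:
            # nxt depends on node: node must finish before nxt can run.
            sorter.add(nxt, node)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc


order = execution_order(WORKFLOW)
print(order)
```

The two `ingest-*` nodes may appear in either order, since the fork lets them run in parallel; the join guarantees that `aggregate` comes after both.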
2) Oozie Architecture
- Oozie's architecture consists of a web server and a database that stores all job information.
- The default web server is Apache Tomcat, an open-source implementation of the Java Servlet technology.
- The Oozie server is a stateless web application: it does not keep any user or job information in memory.
- All of this information is stored in a SQL database, and Oozie retrieves the job state from the database each time it processes a request.
- Users and Oozie clients can interact with the server through the command-line tool, the Java client API, or the HTTP REST API.
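As a rough illustration of the HTTP REST route, the Python sketch below builds and sends a job-information request. The host name and job ID are made up, the helper names `job_info_url` and `fetch_job_info` are hypothetical, and the `/v1/job/<id>?show=info` path is assumed from the Oozie Web Services API; check the path and version against your own installation. Authentication, which secured clusters require, is left out.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def job_info_url(base_url, job_id):
    """Build the URL for a job-information request."""
    query = urlencode({"show": "info"})
    return f"{base_url.rstrip('/')}/v1/job/{quote(job_id, safe='')}?{query}"


def fetch_job_info(base_url, job_id, timeout=10):
    """Send the GET request and decode the JSON response body."""
    with urlopen(job_info_url(base_url, job_id), timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical server address and job ID, for illustration only.
url = job_info_url(
    "http://oozie.example.com:11000/oozie",
    "0000001-200101000000000-oozie-oozi-W",
)
print(url)
```

Because the server is stateless, every such request is answered from the job state held in the database, so repeated calls can be sent to any instance that shares that database.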