Lower East Side, Manhattan
A contract position at a premier New York education institution.
Pay Options: IC - Self Incorporated or W2.
Contact Jessica Ohmer: call (646) 876-9549 or (212) 616-4800 ext. 570, or email firstname.lastname@example.org with the Job Code JO33471.
Location: 726 Broadway.
Skills required for the position: BIG DATA, ETL, MULESOFT.
Detailed Info: A unique opportunity for a data engineer on the Learning & Analytics project. As part of this team, you will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be the construction and maintenance of our data pipeline, ETL processes, and data warehouse. The Data Engineer will also be responsible for data quality and for understanding the data needs of our various data sources in order to anticipate those needs and scale our systems.
Current technology includes (but is not limited to):
Mulesoft API Management
Change Data Capture (CDC)
Data ingestion, preparation, exploration, and consumption using cloud and big data platforms
Tableau as the Business Intelligence (BI) tool
Dimensional and relational table structures
AWS cloud (S3, EC2, EMR, Redshift, etc.)
Snowflake, Attunity, Airflow, Databricks
Development/Computing Environment: Roles & responsibilities may include:
Integrate data from a variety of data sources (data warehouse, data marts) utilizing on-premises or cloud-based data structures using Snowflake
Develop and implement streaming, data lake, and analytics big data solutions
Integrate data from multiple data sources, applying knowledge of various ETL techniques and frameworks using Snowflake or Databricks
Create applications using Change Data Capture tools
Technical Support (includes troubleshooting, monitoring)
Technical Analyst and Project Management Support
Application Performance and System Testing Analyst
"Ideal" candidates will have the following experience, knowledge, skills or abilities:
Utilize ETL processes to build data repositories using Snowflake, Python, etc.; integrate data into a data lake using Spark, PySpark, Hive, Kafka Streaming, etc.
Application development, including Cloud development experience, preferably using AWS (AWS Services, especially S3, API Gateway, Redshift, Lambda, etc.)
Must have expertise in the following: Python/R, SQL, Java
Navigational flow, working with Notebooks, scheduling, integrating with AWS for cloud storage
Working with different file formats (Hive, Parquet, CSV, JSON, Avro, etc.) and compression techniques
Integrating PySpark with different data sources, for example Snowflake, Oracle, Postgres, MySQL, etc.
Experience with AWS web services such as Redshift, S3, RDS, Athena, DynamoDB, and Aurora.
The position offers a competitive rate.
Job Id: 33471