Data Engineer

Summary

Title:	Data Engineer
ID:	1033
Location:	Remote, USA
Department:	Information Technology

Description

Role Overview

We are seeking a Data Engineer to design, build, and scale the data infrastructure behind a next-generation political intelligence and relationship analytics platform. This role focuses on integrating diverse data sources, powering advanced search and knowledge graph capabilities, and enabling predictive analytics and network visualization.

You will work at the intersection of data engineering, cloud architecture, and applied analytics, helping transform structured and unstructured public policy, legislative, and stakeholder data into a high-performance, queryable intelligence platform.

Modali Consulting believes in giving each person a chance to succeed, so you will have the opportunity and autonomy to deliver high-quality, timely projects to our diverse client base.

Who you are. You rank high in conscientiousness: you bring structure, organization, and promptness to everything you do. You also pair this with creative problem-solving and innovative thinking. You are probably the kind of person who is always learning something new, and intellectual pursuits may be your driving motivation. Above all, you seek to do good and right by people. Integrity may be your most admired attribute.

This is a US-Remote role. You must be based in the US to apply.

TO APPLY: Please submit an application using the link on our website, modaliconsulting.com.

What This Job Entails

1. Data Pipeline & Integration Engineering

Design and maintain scalable ETL/ELT pipelines for ingesting structured and unstructured data
Integrate diverse data sources such as:
- Legislative records
- Voting data
- Committee assignments
- Lobbying and relationship data
- Biographical and employment history data
Implement automated data ingestion, transformation, cleansing, and normalization processes
Support batch and near real-time data processing workflows

2. Knowledge Graph & Advanced Data Modeling

Build and maintain graph-based data models representing relationships between people, organizations, legislation, and events
Work with graph databases (e.g., Amazon Neptune or similar) to enable multi-level relationship analysis
Optimize data structures for complex, relationship-driven queries
Support features like “follow-the-money,” influence mapping, and network exploration

3. Search & Retrieval Infrastructure

Engineer data pipelines and indexing strategies to support advanced full-text and semantic search
Integrate search technologies (e.g., OpenSearch, Elasticsearch, Apache Solr) with structured and graph-based data
Support enhanced search use cases including cross-dataset queries and contextual results
Work with vector or embedding-based retrieval to support intelligent data discovery

4. Cloud Data Architecture

Build and manage cloud-native data infrastructure, primarily in AWS
Work with services such as:
- S3 (data lake/storage)
- Graph databases (e.g., Neptune)
- API layers for data access
- Streaming and event-driven components where needed
Implement scalable, secure, and cost-efficient data architectures

5. Analytics & Data Enablement

Prepare curated datasets for:
- Predictive analytics
- Behavioral and trend modeling
- Reporting and visualization
Partner with data scientists and ML engineers to productionize models
Ensure data is structured and accessible for downstream dashboards, visualizations, and applications

6. Data Quality, Governance & Reliability

Implement data validation, monitoring, and anomaly detection
Maintain metadata, lineage, and documentation for key datasets
Support repeatable deployments using CI/CD and infrastructure-as-code practices
Contribute to data security and access control design

Required Qualifications

3–7+ years of experience in data engineering or backend data platform roles
Strong experience building ETL/ELT pipelines using Python, SQL, or similar languages
Experience working with cloud data platforms (AWS preferred)
Hands-on experience with relational and non-relational databases
Experience modeling complex datasets and optimizing for analytical queries
Familiarity with search platforms (Elasticsearch, OpenSearch, or Apache Solr)
Strong SQL skills and data transformation experience
Experience working with APIs and integrating external data sources

Preferred / Nice-to-Have Qualifications

Experience with graph databases (Neptune, Neo4j, etc.)
Exposure to knowledge graphs or relationship-driven data modeling
Experience supporting machine learning or NLP pipelines
Familiarity with vector databases or embedding-based search
Experience with public sector, policy, legislative, or advocacy data
Experience building data platforms for analytics products or SaaS applications

Prerequisites

Ability to work in a fast-paced, iterative product environment
Strong problem-solving skills and ability to translate business questions into data structures
Comfort working with incomplete, messy, or evolving data sources
Ability to collaborate with engineers, analysts, designers, and product stakeholders
Strong documentation and communication skills

What Success Looks Like in This Role

Data from multiple complex sources is reliably ingested, cleaned, and connected
The platform supports fast, complex relationship queries
Search results are relevant, structured, and scalable
Data is trusted, well-documented, and analytics-ready
The data layer enables advanced capabilities like network visualization, predictive insights, and influence analysis
