Data Engineer
Summary
Title: Data Engineer
ID: 1033
Location: Remote, USA
Department: Information Technology
Description

Role Overview

We are seeking a Data Engineer to design, build, and scale the data infrastructure behind a next-generation political intelligence and relationship analytics platform. This role focuses on integrating diverse data sources, powering advanced search and knowledge graph capabilities, and enabling predictive analytics and network visualization.

You will work at the intersection of data engineering, cloud architecture, and applied analytics, helping transform structured and unstructured public policy, legislative, and stakeholder data into a high-performance, queryable intelligence platform.

Modali Consulting believes in giving each person a chance to succeed, so you will have the opportunity and autonomy to deliver high-quality, timely projects to our diverse client base.

Who you are. You rank high in conscientiousness: you bring structure, organization, and promptness to everything you do. You also pair this with creative problem-solving and innovative thinking. You are probably the kind of person who is always learning something new; intellectual pursuits may even be your driving motivation. Above all, you seek to do good and right by people. Integrity may be your most admired attribute.

This is a remote role within the US. You must be based in the US to apply.

TO APPLY: Please submit an application using the link on our website, modaliconsulting.com. 

 


What This Job Entails

1. Data Pipeline & Integration Engineering

  • Design and maintain scalable ETL/ELT pipelines for ingesting structured and unstructured data

  • Integrate diverse data sources such as:

    • Legislative records

    • Voting data

    • Committee assignments

    • Lobbying and relationship data

    • Biographical and employment history data

  • Implement automated data ingestion, transformation, cleansing, and normalization processes

  • Support batch and near real-time data processing workflows

 


2. Knowledge Graph & Advanced Data Modeling

  • Build and maintain graph-based data models representing relationships between people, organizations, legislation, and events

  • Work with graph databases (e.g., Amazon Neptune or similar) to enable multi-level relationship analysis

  • Optimize data structures for complex, relationship-driven queries

  • Support features like “follow-the-money,” influence mapping, and network exploration

 


3. Search & Retrieval Infrastructure

  • Engineer data pipelines and indexing strategies to support advanced full-text and semantic search

  • Integrate search technologies (e.g., OpenSearch/Elastic, Solr) with structured and graph-based data

  • Support enhanced search use cases including cross-dataset queries and contextual results

  • Work with vector or embedding-based retrieval to support intelligent data discovery

 


4. Cloud Data Architecture

  • Build and manage cloud-native data infrastructure, primarily in AWS

  • Work with services such as:

    • S3 (data lake/storage)

    • Graph databases (e.g., Neptune)

    • API layers for data access

    • Streaming and event-driven components where needed

  • Implement scalable, secure, and cost-efficient data architectures

 


5. Analytics & Data Enablement

  • Prepare curated datasets for:

    • Predictive analytics

    • Behavioral and trend modeling

    • Reporting and visualization

  • Partner with data scientists and ML engineers to productionize models

  • Ensure data is structured and accessible for downstream dashboards, visualizations, and applications

 


6. Data Quality, Governance & Reliability

  • Implement data validation, monitoring, and anomaly detection

  • Maintain metadata, lineage, and documentation for key datasets

  • Support repeatable deployments using CI/CD and infrastructure-as-code practices

  • Contribute to data security and access control design

 


Required Qualifications

  • 3–7+ years of experience in data engineering or backend data platform roles

  • Strong experience building ETL/ELT pipelines using Python, SQL, or similar languages

  • Experience working with cloud data platforms (AWS preferred)

  • Hands-on experience with relational and non-relational databases

  • Experience modeling complex datasets and optimizing for analytical queries

  • Familiarity with search platforms (Elastic, OpenSearch, or Solr)

  • Strong SQL skills and data transformation experience

  • Experience working with APIs and integrating external data sources

 


Preferred / Nice-to-Have Qualifications

  • Experience with graph databases (Neptune, Neo4j, etc.)

  • Exposure to knowledge graphs or relationship-driven data modeling

  • Experience supporting machine learning or NLP pipelines

  • Familiarity with vector databases or embedding-based search

  • Experience with public sector, policy, legislative, or advocacy data

  • Experience building data platforms for analytics products or SaaS applications

 


Prerequisites

  • Ability to work in a fast-paced, iterative product environment

  • Strong problem-solving skills and ability to translate business questions into data structures

  • Comfort working with incomplete, messy, or evolving data sources

  • Ability to collaborate with engineers, analysts, designers, and product stakeholders

  • Strong documentation and communication skills

 


What Success Looks Like in This Role

  • Data from multiple complex sources is reliably ingested, cleaned, and connected

  • The platform supports fast, complex relationship queries

  • Search results are relevant, structured, and scalable

  • Data is trusted, well-documented, and analytics-ready

  • The data layer enables advanced capabilities like network visualization, predictive insights, and influence analysis
