“For we are His workmanship, created in Christ Jesus for good works, which God prepared beforehand, that we should walk in them.”

IBM says: “Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI) and machine learning with specific subject matter expertise to uncover actionable insights hidden in an organization’s data. These insights can be used to guide decision making and strategic planning.”
Data scientists are the practitioners within the field of data science.
Data scientists may also work with data engineers, machine learning engineers, software engineers, and data analysts to produce a useful product.
Public servants in government & business leaders alike leverage data to make productive decisions with their budgets. Companies such as Walmart, Amazon, & Netflix rely on data to generate valuable insights. The federal Social Security Administration has analytical models in play that improve claims processing. Law enforcement is beginning to draw on the insights that data scientists & analysts bring to the table. Scientists, like those submitting work to the National Institutes of Health, can choose to share their data so that it can be reused for additional insights.
Recommended limits on what data scientists can share can be found here.
Data Science vs. Machine Learning vs. Artificial Intelligence
Within the technology field today, data scientists, machine learning engineers, & artificial intelligence engineers are in demand.
While the public often treats these titles & fields as interchangeable, there are key differences.
Similarities:
(1) Each of these three fields builds a foundation in the collection, organization, and analysis of data.
(2) Iterative Evolution: these fields each refine their algorithms & models through constant iteration & improvement.
(3) The Pursuit of Predictive Power: forecasting future trends (data science), making informed guesses from patterns that algorithms discern (machine learning), & anticipating preferences/behaviors (artificial intelligence).
Differences:
The field of artificial intelligence extends beyond data manipulation into areas such as robotics, computer vision & natural language modeling. A.I. aims to produce machines that perform tasks, often tasks that a human being would otherwise have to do.
Machine learning emphasizes having machines make predictions based on patterns within data & often involves more software development work than a data scientist would typically take on. The machine learning engineer wants to design & implement a system that allows the computer to make its own decisions based on the data it is fed.
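To make "predictions based on patterns within data" concrete, here is a minimal sketch of one of the simplest learned models: a straight line fitted by ordinary least squares. The function names (`fit_line`, `predict`) and the numbers are hypothetical, made up for illustration; real projects would normally lean on a library such as Scikit-learn rather than hand-written math.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours a machine ran vs. units it produced.
hours = [1, 2, 3, 4, 5]
units = [12, 19, 31, 42, 48]

model = fit_line(hours, units)      # the "learning" step
print(predict(model, 6))            # a prediction for an unseen input
```

The pattern (more hours, more units) is never written into the code as a rule; the slope and intercept are derived from the data, which is the core idea behind letting a system make decisions from what it is fed.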
IBM provides this information about their four-stage approach to the Data Life Cycle:
- Stage 1: Data Ingestion
  - The lifecycle begins here with data collection.
  - The data is raw & can be structured or unstructured.
  - The collection comes from a variety of methods: manual entry, web scraping, real-time streaming data, etc.
  - The sources of data can include: customer data, log files, video, audio, pictures, IoT, social media, and more.
- Stage 2: Data Storage & Data Processing
  - Data is stored & structured by data management teams.
  - The different formats & structures of data collected will influence the type of storage method utilized.
  - Data warehouses, data lakes, or other repositories are utilized.
  - Cleaning data, deduplicating, transforming & combining the data using Extract/Transform/Load (ETL) are all aspects of this stage.
- Stage 3: Data Analysis
  - Biases, patterns, ranges, & distributions of values within the data are found.
  - This exploration drives hypothesis generation for A/B testing.
  - Analysts use this stage to determine whether the data is relevant & useful enough to integrate into models “for predictive analytics, machine learning, and/or deep learning.”
- Stage 4: Communicate
  - Reports & other data visualizations help decision-makers understand the value of the data.
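The four stages above can be sketched end to end in a few lines of standard-library Python. This is only an illustration under stated assumptions: the CSV text, the column names, and the function names (`ingest`, `clean`, `analyze`, `report`) are all hypothetical, and a real pipeline would read from actual sources and write to a warehouse or data lake.

```python
import csv
import io
from statistics import mean, median

# Stage 1: ingestion. Hypothetical raw export containing a duplicate row,
# inconsistent casing, and a missing amount.
RAW_CSV = """customer,region,amount
alice,East,120.50
bob,west,80.00
alice,East,120.50
carol,EAST,
dave,West,200.25
"""


def ingest(text):
    """Read raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Stage 2: drop rows with no amount, normalize fields, deduplicate."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # cleaning: skip rows missing the value we need
        record = (
            row["customer"].strip().lower(),
            row["region"].strip().title(),   # transform: consistent casing
            float(row["amount"]),            # transform: text to number
        )
        if record in seen:
            continue  # deduplication
        seen.add(record)
        cleaned.append(dict(zip(("customer", "region", "amount"), record)))
    return cleaned


def analyze(rows):
    """Stage 3: range and central tendency of amounts, plus totals per region."""
    amounts = [r["amount"] for r in rows]
    by_region = {}
    for r in rows:
        by_region[r["region"]] = by_region.get(r["region"], 0.0) + r["amount"]
    return {
        "count": len(amounts),
        "min": min(amounts),
        "max": max(amounts),
        "mean": mean(amounts),
        "median": median(amounts),
        "by_region": by_region,
    }


def report(summary):
    """Stage 4: a plain-text summary for decision-makers."""
    lines = [
        f"Records analyzed: {summary['count']}",
        f"Amount range: {summary['min']:.2f} to {summary['max']:.2f}",
        f"Mean / median: {summary['mean']:.2f} / {summary['median']:.2f}",
    ]
    for region, total in sorted(summary["by_region"].items()):
        lines.append(f"Total for {region}: {total:.2f}")
    return "\n".join(lines)


print(report(analyze(clean(ingest(RAW_CSV)))))
```

Keeping each stage in its own function mirrors the lifecycle: the cleaning rules can change without touching the analysis, and the report only depends on the summary it is handed.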
Five Stages of the Data Life Cycle
The UC Berkeley School of Information provides this information about their five-stage approach to the Data Life Cycle:
- Stage 1: Capture
  - Data acquisition, data entry, signal reception, data extraction
- Stage 2: Maintain
  - Data warehousing, data cleansing, data staging, data processing, data architecture
- Stage 3: Process
  - Data mining, clustering/classification, data modeling, data summarization
- Stage 4: Analyze
  - Exploratory/confirmatory, predictive analysis, regression, text mining, qualitative analysis
- Stage 5: Communicate
  - Data reporting, data visualization, business intelligence, decision making
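As one small illustration of the "clustering" item in Stage 3, here is a sketch of k-means (Lloyd's algorithm) on one-dimensional data. The function name `kmeans_1d`, the starting centers, and the order values are hypothetical choices for this example; in practice a library implementation with better initialization would be the usual route.

```python
from statistics import mean


def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (Lloyd's algorithm)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable, so stop early
        centers = new_centers
    return centers, clusters


# Hypothetical order values that fall into a "small" and a "large" group.
order_values = [12, 15, 14, 90, 95, 88, 13]
centers, clusters = kmeans_1d(order_values, centers=[0, 100])
print(centers)
print(clusters)
```

Nobody labels the orders as small or large ahead of time; the grouping emerges from the values themselves, which is what separates clustering from classification.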
“Today, effective data scientists masterfully identify relevant questions, collect data from a multitude of different data sources, organize the information, translate results into solutions, and communicate their findings in a way that positively affects business decisions.
These skills are now required in almost all industries, which means data scientists have become increasingly valuable to companies.” – U.C. Berkeley
Those in the data science field utilize tools & languages such as:
Python, R, SQL & SQL variants such as PostgreSQL, Tableau, Power BI, Bokeh, Plotly, Infogram, Excel, Apache Spark, TensorFlow, MLflow, PyTorch, RapidMiner, and Hugging Face, among others.
Abid Ali Awan, blogger for DataCamp, believes the Top Ten Data-Science Tools for 2024 are:
(1) Pandas
(2) Seaborn
(3) Scikit-learn
(5) PyTorch
(6) MLflow
(7) Hugging Face
(8) Tableau
(9) RapidMiner
(10) ChatGPT

Cloud Computing’s Role for Data Science
Access to cloud computing power gives data scientists “additional processing power, storage, and other tools . . . .”
Scalability is vital for data sets that can change & grow large in a time-sensitive manner. The cloud can provide access to data lakes, which allow large volumes of data to be ingested & processed with ease. Additional compute nodes can be added at extra cost, a short-term expense with a potential long-term payoff.