Defining Research Goals
• To understand the project, three concepts must be understood: what, why and how.
a) What does the company or organization expect?
b) Why does the company's management see value in this research?
c) How does it fit into the bigger strategic picture?
• The goal of the first phase is to answer these three questions.
• In this phase, the data science team must learn and investigate the problem, develop context and understanding, and learn about the data sources needed and available for the project.
1. Learning the business domain:
• Understanding the domain area of the problem is essential. In many cases, data scientists will have deep computational and quantitative knowledge that can be broadly applied across many disciplines.
• Data scientists have deep knowledge of the methods, techniques and ways for applying heuristics to a variety of business and conceptual problems.
2. Resources:
• As part of the discovery phase, the team needs to assess the resources available to support the project. In this context, resources include technology, tools, systems, data and people.
3. Framing the problem:
• Framing is the process of stating the analytics problem to be solved. At this point, it is a best practice to write down the problem statement and share it with the key stakeholders.
• A written statement matters because each team member may hear slightly different things about the needs and the problem, and may have somewhat different ideas of possible solutions.
4. Identifying key stakeholders:
• The team identifies the success criteria, the key risks and the stakeholders. Stakeholders should include anyone who will benefit from the project or will be significantly impacted by it.
• When interviewing stakeholders, learn about the domain area and any relevant history from similar analytics projects.
5. Interviewing the analytics sponsor:
• The team should plan to collaborate with the stakeholders to clarify and frame the analytics problem.
• At the outset, project sponsors may have a predetermined solution that does not necessarily achieve the desired outcome.
• In these cases, the team must use its knowledge and expertise to identify the true underlying problem and an appropriate solution.
• When interviewing the main stakeholders, the team needs to take time to thoroughly interview the project sponsor, who tends to be the one funding the project or providing the high-level requirements.
• This person understands the problem and usually has an idea of a potential working solution.
6. Developing initial hypotheses:
• This step involves forming ideas that the team can test with data. Generally, it is best to come up with a few primary hypotheses to test and then be creative about developing several more.
• These initial hypotheses form the basis of the analytical tests the team will use in later phases and serve as the foundation for the findings in a later phase.
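To make "ideas that the team can test with data" concrete, here is a minimal sketch of one possible analytical test for an initial hypothesis. The choice of a permutation test, the function name `permutation_p_value`, the example hypothesis and all data values are assumptions made for illustration; none of them come from the notes above.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling group labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical hypothesis: "customers who received a reminder spend more".
# The numbers below are made up purely for illustration.
with_reminder = [23.0, 27.5, 31.0, 29.0, 35.5, 30.0]
without_reminder = [21.0, 24.5, 22.0, 26.0, 25.5, 23.5]

p = permutation_p_value(with_reminder, without_reminder)
print(f"estimated p-value: {p:.3f}")
```

A small estimated p-value would suggest that the observed difference is unlikely to arise from random labelling alone. In a real project the team would choose the test to suit the data and the hypothesis.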
7. Identifying potential data sources:
• Consider the volume, type and time span of the data needed to test the hypotheses. Ensure that the team can access more than simply aggregated data. In most cases, the team will need the raw data to avoid introducing bias into the downstream analysis.
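The point about raw versus aggregated data can be illustrated with a short sketch. The records, region names and order values below are hypothetical and chosen only to show how an aggregate can hide what the raw data reveals.

```python
from statistics import mean, median

# Hypothetical raw records: (region, order_value). Values are made up for illustration.
raw_orders = [
    ("north", 10.0), ("north", 12.0), ("north", 11.0), ("north", 95.0),
    ("south", 30.0), ("south", 32.0), ("south", 31.0), ("south", 35.0),
]


def group_values(records):
    """Collect order values per region from the raw records."""
    groups = {}
    for region, value in records:
        groups.setdefault(region, []).append(value)
    return groups


groups = group_values(raw_orders)

# What an aggregated extract would typically expose: one mean per region.
aggregated = {region: mean(values) for region, values in groups.items()}

# What only the raw data allows: checking the distribution behind each mean.
medians = {region: median(values) for region, values in groups.items()}

for region in sorted(groups):
    print(region, "mean:", aggregated[region], "median:", medians[region])
```

In this toy data both regions have the same mean (32.0), yet the medians differ (11.5 for north, 31.5 for south) because a single large order dominates the north. A team working only from the aggregated means would miss this.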
Foundation of Data Science (CS3352), Unit I: Introduction | 3rd Semester CSE Dept | 2021 Regulation