Data Collection should follow what model?
ISTAR
I - Intake & Orientation (What is the aim of the investigation, what are you looking for, where are you going to look, how much time do you have? etc)
S - Strategy, Search & Store (come up with a strategy for your searching & how are you going to store and manage your actions & results)
T - Technical Capabilities & Tactical Application (yours & the internet's; what apps, software, automation do you have?)
A - Analysis (GUIDs) (a continual process - especially when GUIDs are located that may allow attribution to a device or person)
R - Refine, Recycle & Reporting (refine and redo searches as you go based on results, then complete a report detailing your findings).
Information Retrieval (Gathering of online info)
Information retrieval comes from information needs.
4 rough types of information needs:
Search Methods - SIS v Thorough
SIS - Short Internet Scan. ‘quick & dirty’, no real plan, quick, shallow, google mostly used on one or two key words which will get a result, used by most people. (Men v women in result page)
Thorough Search - with a plan, using different methods, based on ISTAR
ISTAR - Intake & orientation
Be a super searcher.
Ask:
- what am I looking for?
- Why?
- Who for?
- What purpose are you going to use the information for?
- What info do we already have?
- what damage can be done (risk - organisational or personal)?
- How much time do we have?
Remember the 7 golden W’s who, what, why, where, when, what way and with what (how)
ISTAR - S - Strategies, Search & Store. Describe 5 analytical search strategies.
Search styles & strategies:
Analytical search strategies
ISTAR - S - Search Strategy & Store (cont) Yield, Precision & Recall.
Not just about quantity of results - how many of them are relevant?
YIELD = quantity of results you get back
PRECISION & RECALL are statistical classifications.
PRECISION is a measure of EXACTNESS (how relevant the results are).
It is calculated by the number of RELEVANT documents retrieved DIVIDED by the TOTAL number of documents retrieved.
A perfect precision score of 1 means all the results that were returned were relevant.
But it doesn’t tell us whether all relevant documents were retrieved.
RECALL is a measure of COMPLETENESS.
It is calculated by the number of the relevant documents retrieved from a search DIVIDED by the TOTAL number of all existing relevant documents that should have been retrieved.
A perfect recall score of 1 means that all the relevant documents that exist were retrieved, but it does not tell us how many irrelevant documents were also obtained.
RECALL & PRECISION therefore have a relationship: you can increase one at the cost of the other.
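The two calculations above can be sketched in a few lines of Python. This is only an illustration: the function name precision_recall and the document IDs are made up for the example, and in a real investigation the full set of relevant documents is usually unknown, so recall can only be estimated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: IDs of the documents the search returned
    relevant:  IDs of all relevant documents that exist (assumed known here)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved

    # Precision: relevant retrieved / total retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: relevant retrieved / all existing relevant documents
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 results returned, 2 of them relevant,
# while 5 relevant documents exist in total.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc1", "doc2", "doc5", "doc6", "doc7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.50 recall=0.40
```

Here half of what came back was relevant (precision 0.5), but only two of the five relevant documents were found (recall 0.4).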
ISTAR - S - Search Strategy & Store (cont) Yield, Precision & Recall (Cont)
Our goal is to get as many relevant documents as we can for each search (fewest irrelevant) AND to get all the relevant documents that exist.
So how do we influence PRECISION to get the best results?
Use search operators like AND, OR and quotes, especially where we want to keep the relationship between two words - e.g. happy and hour searched together as "happy hour".
Try to avoid words with double meanings, be as specific as possible.
How do we influence RECALL?
Use variations in spelling, use synonyms, use general terms
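As a rough sketch of how both levers combine in one query: quoted phrases joined with AND push precision up, while OR groups of synonyms and spelling variants push recall up. The helper build_query and the example terms are hypothetical, and grouping with parentheses is an assumption; operator syntax differs between search engines, so check the one you are using.

```python
def build_query(required_phrases, synonym_groups):
    """Build a search string from exact phrases and groups of alternatives.

    required_phrases: phrases that must appear exactly (quoted, precision)
    synonym_groups:   lists of alternatives, any one of which may match (recall)
    """
    # Quote each required phrase so the words stay together
    parts = [f'"{phrase}"' for phrase in required_phrases]

    for group in synonym_groups:
        # Quote multi-word alternatives, leave single words bare
        terms = [f'"{t}"' if " " in t else t for t in group]
        parts.append("(" + " OR ".join(terms) + ")")

    # Every part must match, so join them with AND
    return " AND ".join(parts)


query = build_query(["happy hour"], [["pub", "bar", "tavern"]])
print(query)
# prints: "happy hour" AND (pub OR bar OR tavern)
```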
ISTAR - S - Search Strategy & Store (cont) Considering WHERE to search
E.g.
General search engines on the clear web (Google, Bing etc.)
Meta search engines or combined tools (to search multiple search engines at once)
Specialised search engines, databases, portals
Social media
Deep web
Multimedia (Google Earth, other maps, images, YouTube etc.)
Other sources like Usenet, IRC, P2P Torrent
ISTAR - S - Search Strategy & Store (cont) - STORE
Key considerations.
Internet is dynamic not static.
Need to preserve the dynamic content at that time by downloading the web content accurately to include all data, so it can be accessed offline.
Keep notes, URLs, times / dates, screen dumps
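One possible way to keep such notes consistently is a small log record per saved page. This is a minimal sketch, not a prescribed procedure: the function make_capture_record, the field names and the example URL are all hypothetical, and adding a SHA-256 hash of the saved content is an assumption that goes beyond the notes above (it lets you show later that the offline copy has not changed).

```python
import hashlib
import json
from datetime import datetime, timezone


def make_capture_record(url, content, note="", captured_at=None):
    """Describe one piece of saved web content for the investigation log.

    url:         where the content was found
    content:     the saved content as bytes (already downloaded elsewhere)
    note:        free-text note by the investigator
    captured_at: timezone-aware datetime; defaults to the current UTC time
    """
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    return {
        "url": url,
        "captured_at_utc": captured_at.isoformat(),
        # Hash of the saved bytes (assumption: not part of the original notes)
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "note": note,
    }


# Hypothetical example with placeholder content
record = make_capture_record(
    "https://example.com/profile",
    b"<html>saved page</html>",
    note="Profile page, saved with screen dump",
)
print(json.dumps(record, indent=2))
```

Appending each record to a log file gives a dated trail of what was found where, which also feeds the final report.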
ISTAR - T - Technical Capabilities & Tactical Applications
ISTAR - A - Analyse
ISTAR - R - REFINE, RECYCLE & REPORT