Commit 50e0ca8

Merge pull request #20622 from owen-mc/docs/fix-dataflow-examples
Docs: add path query example to data flow docs
2 parents a0d2005 + c8c1c6e commit 50e0ca8

9 files changed (+388, -10 lines changed)

docs/codeql/codeql-language-guides/analyzing-data-flow-in-cpp.rst

Lines changed: 43 additions & 1 deletion
@@ -314,7 +314,7 @@ Exercise 2: Write a query that finds all hard-coded strings used to create a ``h

 Exercise 3: Write a class that represents flow sources from ``getenv``. (`Answer <#exercise-3>`__)

-Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``getenv`` to ``gethostbyname``. (`Answer <#exercise-4>`__)
+Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``getenv`` to ``gethostbyname``. (`Answer <#exercise-4>`__ `Answer as a path query <#path-query-example>`__)

 Answers
 -------
@@ -411,6 +411,48 @@ Exercise 4
      GetenvToGethostbynameFlow::flow(source, sink)
    select getenv, fc

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the answer to exercise 4 above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id getenv-to-gethostbyname
+    */
+
+   import cpp
+   import semmle.code.cpp.dataflow.new.DataFlow
+
+   class GetenvSource extends DataFlow::Node {
+     GetenvSource() { this.asIndirectExpr(1).(FunctionCall).getTarget().hasGlobalName("getenv") }
+   }
+
+   module GetenvToGethostbynameConfiguration implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node source) { source instanceof GetenvSource }
+
+     predicate isSink(DataFlow::Node sink) {
+       exists(FunctionCall fc |
+         sink.asIndirectExpr(1) = fc.getArgument(0) and
+         fc.getTarget().hasName("gethostbyname")
+       )
+     }
+   }
+
+   module GetenvToGethostbynameFlow = DataFlow::Global<GetenvToGethostbynameConfiguration>;
+
+   import GetenvToGethostbynameFlow::PathGraph
+
+   from GetenvToGethostbynameFlow::PathNode source, GetenvToGethostbynameFlow::PathNode sink
+   where GetenvToGethostbynameFlow::flowPath(source, sink)
+   select sink.getNode(), source, sink, "This file access uses data from $@.",
+     source, "user-controllable input."
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".
+
 Further reading
 ---------------

docs/codeql/codeql-language-guides/analyzing-data-flow-in-csharp.rst

Lines changed: 43 additions & 1 deletion
@@ -287,7 +287,7 @@ Exercise 2: Find all hard-coded strings passed to ``System.Uri``, using global d

 Exercise 3: Define a class that represents flow sources from ``System.Environment.GetEnvironmentVariable``. (`Answer <#exercise-3>`__)

-Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``System.Environment.GetEnvironmentVariable`` to ``System.Uri``. (`Answer <#exercise-4>`__)
+Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``System.Environment.GetEnvironmentVariable`` to ``System.Uri``. (`Answer <#exercise-4>`__ `Answer as a path query <#path-query-example>`__)

 Extending library data flow
 ---------------------------
@@ -537,6 +537,48 @@ This can be adapted from the ``SystemUriFlow`` class:
      }
    }

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the answer to exercise 4 above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id getenv-to-gethostbyname
+    */
+
+   import csharp
+
+   class EnvironmentVariableFlowSource extends DataFlow::ExprNode {
+     EnvironmentVariableFlowSource() {
+       this.getExpr().(MethodCall).getTarget().hasQualifiedName("System.Environment.GetEnvironmentVariable")
+     }
+   }
+
+   module EnvironmentToUriConfig implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node src) {
+       src instanceof EnvironmentVariableFlowSource
+     }
+
+     predicate isSink(DataFlow::Node sink) {
+       exists(Call c | c.getTarget().(Constructor).getDeclaringType().hasQualifiedName("System.Uri")
+         and sink.asExpr()=c.getArgument(0))
+     }
+   }
+
+   module EnvironmentToUriFlow = DataFlow::Global<EnvironmentToUriConfig>;
+
+   import EnvironmentToUriFlow::PathGraph
+
+   from EnvironmentToUriFlow::PathNode src, EnvironmentToUriFlow::PathNode sink
+   where EnvironmentToUriFlow::flowPath(src, sink)
+   select src.getNode(), src, sink, "This environment variable constructs a 'System.Uri' $@.", sink, "here"
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".
+
 Further reading
 ---------------

docs/codeql/codeql-language-guides/analyzing-data-flow-in-go.rst

Lines changed: 54 additions & 5 deletions
@@ -224,7 +224,7 @@ The resulting module has an identical signature to the one obtained from ``DataF
 Flow sources
 ~~~~~~~~~~~~

-The data flow library contains some predefined flow sources. The class ``RemoteFlowSource`` (defined in ``semmle.code.java.dataflow.FlowSources``) represents data flow sources that may be controlled by a remote user, which is useful for finding security problems.
+The data flow library contains some predefined flow sources. The class ``RemoteFlowSource`` represents data flow sources that may be controlled by a remote user, which is useful for finding security problems.

 Examples
 ~~~~~~~~
@@ -252,7 +252,7 @@ Exercise 2: Write a query that finds all hard-coded strings used to create a ``u

 Exercise 3: Write a class that represents flow sources from ``os.Getenv(..)``. (`Answer <#exercise-3>`__)

-Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``os.Getenv`` to ``url.URL``. (`Answer <#exercise-4>`__)
+Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``os.Getenv`` to ``url.URL``. (`Answer <#exercise-4>`__ `Answer as a path query <#path-query-example>`__)

 Answers
 -------
@@ -312,7 +312,7 @@ Exercise 3

    import go

-   class GetenvSource extends CallExpr {
+   class GetenvSource extends DataFlow::CallNode {
      GetenvSource() {
        exists(Function m | m = this.getTarget() |
          m.hasQualifiedName("os", "Getenv")
@@ -327,7 +327,7 @@ Exercise 4

    import go

-   class GetenvSource extends CallExpr {
+   class GetenvSource extends DataFlow::CallNode {
      GetenvSource() {
        exists(Function m | m = this.getTarget() |
          m.hasQualifiedName("os", "Getenv")
@@ -350,7 +350,6 @@ Exercise 4
          sink.asExpr() = call.getArgument(0)
        )
      }
-     }
    }

    module GetenvToURLFlow = DataFlow::Global<GetenvToURLConfig>;
@@ -359,6 +358,56 @@ Exercise 4
    where GetenvToURLFlow::flow(src, sink)
    select src, "This environment variable constructs a URL $@.", sink, "here"

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the answer to exercise 4 above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id getenv-to-url
+    */
+
+   import go
+
+   class GetenvSource extends DataFlow::CallNode {
+     GetenvSource() {
+       exists(Function m | m = this.getTarget() |
+         m.hasQualifiedName("os", "Getenv")
+       )
+     }
+   }
+
+   module GetenvToURLConfig implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node source) {
+       source instanceof GetenvSource
+     }
+
+     predicate isSink(DataFlow::Node sink) {
+       exists(Function urlParse, CallExpr call |
+         (
+           urlParse.hasQualifiedName("url", "Parse") or
+           urlParse.hasQualifiedName("url", "ParseRequestURI")
+         ) and
+         call.getTarget() = urlParse and
+         sink.asExpr() = call.getArgument(0)
+       )
+     }
+   }
+
+   module GetenvToURLFlow = DataFlow::Global<GetenvToURLConfig>;
+
+   import GetenvToURLFlow::PathGraph
+
+   from GetenvToURLFlow::PathNode src, GetenvToURLFlow::PathNode sink
+   where GetenvToURLFlow::flowPath(src, sink)
+   select src.getNode(), src, sink, "This environment variable constructs a URL $@.", sink, "here"
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".
+
 Further reading
 ---------------

docs/codeql/codeql-language-guides/analyzing-data-flow-in-java.rst

Lines changed: 49 additions & 1 deletion
@@ -262,7 +262,7 @@ Exercise 2: Write a query that finds all hard-coded strings used to create a ``j

 Exercise 3: Write a class that represents flow sources from ``java.lang.System.getenv(..)``. (`Answer <#exercise-3>`__)

-Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``getenv`` to ``java.net.URL``. (`Answer <#exercise-4>`__)
+Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from ``getenv`` to ``java.net.URL``. (`Answer <#exercise-4>`__ `Answer as a path query <#path-query-example>`__)

 Answers
 -------
@@ -361,6 +361,54 @@ Exercise 4
    where GetenvToURLFlow::flow(src, sink)
    select src, "This environment variable constructs a URL $@.", sink, "here"

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the answer to exercise 4 above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id getenv-to-url
+    */
+
+   import java
+   import semmle.code.java.dataflow.DataFlow
+
+   class GetenvSource extends DataFlow::ExprNode {
+     GetenvSource() {
+       exists(Method m | m = this.asExpr().(MethodCall).getMethod() |
+         m.hasName("getenv") and
+         m.getDeclaringType() instanceof TypeSystem
+       )
+     }
+   }
+
+   module GetenvToURLConfig implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node source) {
+       source instanceof GetenvSource
+     }
+
+     predicate isSink(DataFlow::Node sink) {
+       exists(Call call |
+         sink.asExpr() = call.getArgument(0) and
+         call.getCallee().(Constructor).getDeclaringType().hasQualifiedName("java.net", "URL")
+       )
+     }
+   }
+
+   module GetenvToURLFlow = DataFlow::Global<GetenvToURLConfig>;
+
+   import GetenvToURLFlow::PathGraph
+
+   from GetenvToURLFlow::PathNode src, GetenvToURLFlow::PathNode sink
+   where GetenvToURLFlow::flowPath(src, sink)
+   select src.getNode(), src, sink, "This environment variable constructs a URL $@.", sink, "here"
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".
+
 Further reading
 ---------------

docs/codeql/codeql-language-guides/analyzing-data-flow-in-javascript-and-typescript.rst

Lines changed: 43 additions & 1 deletion
@@ -456,7 +456,7 @@ Exercise 3: Write a class which represents flow sources from the array elements
 Hint: array indices are properties with numeric names; you can use regular expression matching to check this. (`Answer <#exercise-3>`__)

 Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flow paths from array elements of the result of a call to the ``tagName`` argument to the
-``createElement`` function. (`Answer <#exercise-4>`__)
+``createElement`` function. (`Answer <#exercise-4>`__ `Answer as a path query <#path-query-example>`__)

 Answers
 -------
@@ -541,6 +541,48 @@ Exercise 4
    where HardCodedTagNameFlow::flow(source, sink)
    select source, sink

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the answer to exercise 4 above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id hard-coded-tag-name
+    */
+
+   import javascript
+
+   class ArrayEntryCallResult extends DataFlow::Node {
+     ArrayEntryCallResult() {
+       exists(DataFlow::CallNode call, string index |
+         this = call.getAPropertyRead(index) and
+         index.regexpMatch("\\d+")
+       )
+     }
+   }
+
+   module HardCodedTagNameConfig implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node source) { source instanceof ArrayEntryCallResult }
+
+     predicate isSink(DataFlow::Node sink) {
+       sink = DataFlow::globalVarRef("document").getAMethodCall("createElement").getArgument(0)
+     }
+   }
+
+   module HardCodedTagNameFlow = DataFlow::Global<HardCodedTagNameConfig>;
+
+   import HardCodedTagNameFlow::PathGraph
+
+   from HardCodedTagNameFlow::PathNode source, HardCodedTagNameFlow::PathNode sink
+   where HardCodedTagNameFlow::flowPath(source, sink)
+   select sink.getNode(), source, sink, "Hard-coded tag name $@.", source, "here"
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".
+
 Further reading
 ---------------

docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst

Lines changed: 40 additions & 1 deletion
@@ -354,11 +354,50 @@ This data flow configuration tracks data flow from environment variables to open
    select fileOpen, "This call to 'os.open' uses data from $@.",
      environment, "call to 'os.getenv'"

+Path query example
+~~~~~~~~~~~~~~~~~~
+
+Here is the network input example above, converted into a path query:
+
+.. code-block:: ql
+
+   /**
+    * @kind path-problem
+    * @problem.severity warning
+    * @id file-system-access-from-remote-input
+    */
+
+   import python
+   import semmle.python.dataflow.new.DataFlow
+   import semmle.python.dataflow.new.TaintTracking
+   import semmle.python.dataflow.new.RemoteFlowSources
+   import semmle.python.Concepts
+
+   module RemoteToFileConfiguration implements DataFlow::ConfigSig {
+     predicate isSource(DataFlow::Node source) {
+       source instanceof RemoteFlowSource
+     }
+
+     predicate isSink(DataFlow::Node sink) {
+       sink = any(FileSystemAccess fa).getAPathArgument()
+     }
+   }
+
+   module RemoteToFileFlow = TaintTracking::Global<RemoteToFileConfiguration>;
+
+   import RemoteToFileFlow::PathGraph
+
+   from RemoteToFileFlow::PathNode input, RemoteToFileFlow::PathNode fileAccess
+   where RemoteToFileFlow::flowPath(input, fileAccess)
+   select fileAccess.getNode(), input, fileAccess, "This file access uses data from $@.",
+     input, "user-controllable input."
+
+For more information, see "`Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__".

 Further reading
 ---------------

-- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
+- `Creating path queries <https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/>`__.


 .. include:: ../reusables/python-further-reading.rst
