Commit d508d49

Add dir_entry::refresh and file type observers. Use them in recursive dir_it.
This commit changes the behavior of directory_entry constructors and modifiers that change the stored path in v4: the methods will now automatically query the filesystem for the file status instead of leaving the cached data default-initialized. This means that the paths passed to directory_entry must be valid, otherwise an error will be returned. Filesystem querying is implemented in the new directory_entry::refresh methods.

The constructors and modifiers that accepted file_status arguments are now removed in v4. The cached file statuses are an implementation detail, and eventually we may want to add more cached data as we add more observers to directory_entry.

Also added a few file type observers to directory_entry. These observers make it possible to avoid querying the filesystem if the full file status is not cached but the file type is (i.e. when permissions are not cached). This is the case with the readdir-based implementation of directory_iterator, if the underlying C library supports the dirent::d_type field.

recursive_directory_iterator has been updated to use the added file type observers instead of querying the full status. This may improve performance of directory iteration.

Closes #288.
1 parent 1aff314 commit d508d49
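
The v4 behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration and not code from the commit: it uses `std::filesystem::directory_entry` from C++17 as a stand-in, on the assumption (taken from the release notes in this commit) that the new Boost.Filesystem v4 behavior is similar to std::filesystem. The helper names `is_existing_directory` and `still_exists` are hypothetical. How a nonexistent path is reported (an error code or a "not found" file type) may differ between implementations, so the sketch treats both outcomes the same way.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns true only if `p` names an accessible directory.
bool is_existing_directory(const fs::path& p)
{
    std::error_code ec;
    // The constructor queries the filesystem itself (it calls refresh(ec)),
    // so no file_status argument is passed in.
    fs::directory_entry entry(p, ec);
    if (ec)
        return false; // the query failed, e.g. the path is not accessible

    // File type observer; may be answered from cached data.
    const bool dir = entry.is_directory(ec);
    return !ec && dir;
}

// Hypothetical helper: re-queries the filesystem for a possibly stale entry.
bool still_exists(fs::directory_entry& entry)
{
    std::error_code ec;
    entry.refresh(ec);                    // update whatever data is cached
    const bool present = entry.exists(ec);
    return !ec && present;
}
```

The `error_code` overloads are used throughout because, as the added reference notes say, callers cannot rely on whether a given observer is served from the cache or triggers a filesystem query that may fail.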

7 files changed, +456 -142 lines changed

doc/reference.html

+123 -54
@@ -2543,23 +2543,48 @@ <h2><a name="Class-directory_entry">Class <code>directory_entry</code></a> [clas
25432543
// <a href="#directory_entry-constructors">constructors</a> and destructor
25442544
directory_entry();
25452545
directory_entry(const directory_entry&amp;);
2546-
explicit directory_entry(const path&amp; p, file_status st=file_status(),
2546+
explicit directory_entry(const path&amp; p);
2547+
directory_entry(const path&amp; p, system::error_code&amp; ec); // v4-only
2548+
directory_entry(const path&amp; p, file_status st, // v3-only
25472549
file_status symlink_st=file_status());
25482550
~directory_entry();
25492551

25502552
// <a href="#directory_entry-modifiers">modifiers</a>
25512553
directory_entry&amp; operator=(const directory_entry&amp;);
2552-
void assign(const path&amp; p, file_status st=file_status(),
2554+
void assign(const path&amp; p);
2555+
void assign(const path&amp; p, system::error_code&amp; ec); // v4-only
2556+
void assign(const path&amp; p, file_status st, // v3-only
25532557
file_status symlink_st=file_status());
2554-
void replace_filename(const path&amp; p, file_status st=file_status(),
2558+
void replace_filename(const path&amp; p);
2559+
void replace_filename(const path&amp; p, system::error_code&amp; ec); // v4-only
2560+
void replace_filename(const path&amp; p, file_status st, // v3-only
25552561
file_status symlink_st=file_status());
25562562

2563+
void refresh();
2564+
void refresh(system::error_code&amp; ec);
2565+
25572566
// <a href="#directory_entry-observers">observers</a>
25582567
const path&amp; path() const;
2568+
25592569
file_status status() const;
25602570
file_status status(system::error_code&amp; ec) const;
25612571
file_status symlink_status() const;
25622572
file_status symlink_status(system::error_code&amp; ec) const;
2573+
file_type file_type() const;
2574+
file_type file_type(system::error_code&amp; ec) const;
2575+
file_type symlink_file_type() const;
2576+
file_type symlink_file_type(system::error_code&amp; ec) const;
2577+
2578+
bool exists() const;
2579+
bool exists(system::error_code&amp; ec) const;
2580+
bool is_regular_file() const;
2581+
bool is_regular_file(system::error_code&amp; ec) const;
2582+
bool is_directory() const;
2583+
bool is_directory(system::error_code&amp; ec) const;
2584+
bool is_symlink() const;
2585+
bool is_symlink(system::error_code&amp; ec) const;
2586+
bool is_other() const;
2587+
bool is_other(system::error_code&amp; ec) const;
25632588

25642589
bool operator&lt; (const directory_entry&amp; rhs);
25652590
bool operator==(const directory_entry&amp; rhs);
@@ -2577,10 +2602,12 @@ <h2><a name="Class-directory_entry">Class <code>directory_entry</code></a> [clas
25772602
} // namespace filesystem
25782603
} // namespace boost</pre>
25792604

2580-
<p>A <code>directory_entry</code> object stores a <code>path object</code>,
2581-
a <code>file_status</code> object for non-symbolic link status, and a <code>file_status</code> object for symbolic link status. The <code>file_status</code> objects act as value caches.</p>
2605+
<p>A <code>directory_entry</code> object stores a <code>path</code> object,
2606+
as well as some amount of cached information about the file identified by the path.
2607+
Currently, the cached information includes a <code>file_status</code> object for non-symbolic
2608+
link status and a <code>file_status</code> object for symbolic link status.</p>
25822609
<blockquote>
2583-
<p>[<i>Note:</i> Because <code>status()</code>on a pathname may be a relatively expensive operation,
2610+
<p>[<i>Note:</i> Because <code>status()</code> on a pathname may be a relatively expensive operation,
25842611
some operating systems provide status information as a byproduct of directory
25852612
iteration. Caching such status information can result is significant time savings. Cached and
25862613
non-cached results may differ in the presence of file system races. <i>—end note</i>]</p>
@@ -2589,8 +2616,14 @@ <h2><a name="Class-directory_entry">Class <code>directory_entry</code></a> [clas
25892616
versus one second for cached status queries. Windows XP, 3.0 GHz processor, with
25902617
a moderately fast hard-drive. Similar speedups are expected on Linux and BSD-derived
25912618
systems that provide status as a by-product of directory iteration.</i></span></p>
2592-
</blockquote>
2593-
<h3> <a name="directory_entry-constructors"> <code>directory_entry </code>constructors</a>
2619+
<p>[<i>Note:</i> The exact set of cached information may vary from one Boost.Filesystem version
2620+
to another, and also between different operating systems and underlying file systems. Users' code
2621+
must not rely on whether a certain piece of information is cached or not. This means that calling
2622+
most observers and modifiers of <code>directory_entry</code> may or may not result in a filesystem
2623+
query that may potentially fail. Information caching is exclusively a performance feature aimed
2624+
at reducing the amount of such queries. <i>—end note</i>]</p>
2625+
</blockquote>
2626+
<h3> <a name="directory_entry-constructors"> <code>directory_entry</code> constructors</a>
25942627
[directory_entry.cons]</h3>
25952628
<pre>directory_entry();</pre>
25962629
<blockquote>
@@ -2614,9 +2647,18 @@ <h3> <a name="directory_entry-constructors"> <code>directory_entry </code>constr
26142647
</tr>
26152648
</table>
26162649
</blockquote>
2617-
<pre>explicit directory_entry(const path&amp; p, file_status st=file_status(), file_status symlink_st=file_status());</pre>
2650+
<pre>explicit directory_entry(const path&amp; p);
2651+
directory_entry(const path&amp; p, system::error_code&amp; ec); // v4-only</pre>
26182652
<blockquote>
2619-
<p><i>Postcondition:</i></p>
2653+
<p><i>Effects:</i></p>
2654+
<p><b>v3:</b> Initializes <code>m_path</code> from <code>p</code> and default-constructs <code>m_status</code> and <code>m_symlink_status</code>.</p>
2655+
<p>[<i>Note:</i> The cached file statuses will be updated when queried by the caller or by an explicit call to <code>refresh</code>. <i>—end note</i>]</p>
2656+
<p><b>v4:</b> Initializes <code>m_path</code> from <code>p</code> and calls <code>refresh()</code> or <code>refresh(ec)</code>, respectively.</p>
2657+
<p><i>Postcondition:</i> <code>path() == p</code> if no error occurs, otherwise <code>path().empty() == true</code>.</p>
2658+
</blockquote>
2659+
<pre>directory_entry(const path&amp; p, file_status st, file_status symlink_st=file_status()); // v3-only</pre>
2660+
<blockquote>
2661+
<p><b>v3:</b> <i>Postcondition:</i></p>
26202662
<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="36%">
26212663
<tr>
26222664
<td width="18%"><b>Expression</b></td>
@@ -2636,11 +2678,20 @@ <h3> <a name="directory_entry-constructors"> <code>directory_entry </code>constr
26362678
</tr>
26372679
</table>
26382680
</blockquote>
2639-
<h3> <a name="directory_entry-modifiers"> <code>directory_entry </code>modifiers</a>
2681+
<h3> <a name="directory_entry-modifiers"> <code>directory_entry</code> modifiers</a>
26402682
[directory_entry.mods]</h3>
2641-
<pre>void assign(const path&amp; p, file_status st=file_status(), file_status symlink_st=file_status());</pre>
2683+
<pre>void assign(const path&amp; p);
2684+
void assign(const path&amp; p, system::error_code&amp; ec); // v4-only</pre>
26422685
<blockquote>
2643-
<p><i>Postcondition:</i></p>
2686+
<p><i>Effects:</i></p>
2687+
<p><b>v3:</b> Assigns <code>p</code> to <code>m_path</code> and <code>file_status()</code> to <code>m_status</code> and <code>m_symlink_status</code>.</p>
2688+
<p>[<i>Note:</i> The cached file statuses will be updated when queried by the caller or by an explicit call to <code>refresh</code>. <i>—end note</i>]</p>
2689+
<p><b>v4:</b> Assigns <code>p</code> to <code>m_path</code> and calls <code>refresh()</code> or <code>refresh(ec)</code>, respectively. If an error
2690+
occurs, the value of the cached data is unspecified.</p>
2691+
</blockquote>
2692+
<pre>void assign(const path&amp; p, file_status st, file_status symlink_st=file_status()); // v3-only</pre>
2693+
<blockquote>
2694+
<p><b>v3:</b> <i>Postcondition:</i></p>
26442695
<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="36%">
26452696
<tr>
26462697
<td width="18%"><b>Expression</b></td>
@@ -2660,27 +2711,25 @@ <h3> <a name="directory_entry-modifiers"> <code>directory_entry </code>modifiers
26602711
</tr>
26612712
</table>
26622713
</blockquote>
2663-
<pre>void replace_filename(const path&amp; p, file_status st=file_status(), file_status symlink_st=file_status());</pre>
2714+
<pre>void replace_filename(const path&amp; p);
2715+
void replace_filename(const path&amp; p, system::error_code&amp; ec); // v4-only</pre>
26642716
<blockquote>
2665-
<p><i>Postcondition:</i></p>
2666-
<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="43%">
2667-
<tr>
2668-
<td width="18%"><b>Expression</b></td>
2669-
<td width="82%"><b>Value</b></td>
2670-
</tr>
2671-
<tr>
2672-
<td width="18%"><code>path()</code></td>
2673-
<td width="82%"><code>path().branch() / s</code></td>
2674-
</tr>
2675-
<tr>
2676-
<td width="18%"><code>status()</code></td>
2677-
<td width="82%"><code>st</code></td>
2678-
</tr>
2679-
<tr>
2680-
<td width="18%"><code>symlink_status()</code></td>
2681-
<td width="82%"><code>symlink_st</code></td>
2682-
</tr>
2683-
</table>
2717+
<p><i>Effects:</i></p>
2718+
<p><b>v3:</b> Calls <code>m_path.replace_filename(p)</code> and assigns <code>file_status()</code> to <code>m_status</code> and <code>m_symlink_status</code>.</p>
2719+
<p>[<i>Note:</i> The cached file statuses will be updated when queried by the caller or by an explicit call to <code>refresh</code>. <i>—end note</i>]</p>
2720+
<p><b>v4:</b> Calls <code>m_path.replace_filename(p)</code> and then <code>refresh()</code> or <code>refresh(ec)</code>, respectively. If an error
2721+
occurs, the value of the cached data is unspecified.</p>
2722+
</blockquote>
2723+
<pre>void replace_filename(const path&amp; p, file_status st, file_status symlink_st=file_status()); // v3-only</pre>
2724+
<blockquote>
2725+
<p><i>Effects:</i> <b>v3:</b> Calls <code>m_path.replace_filename(p)</code> and assigns <code>st</code> to <code>m_status</code> and <code>symlink_st</code>
2726+
to <code>m_symlink_status</code>.</p>
2727+
</blockquote>
2728+
<pre>void refresh();
2729+
void refresh(system::error_code&amp; ec);</pre>
2730+
<blockquote>
2731+
<p><i>Effects:</i> Updates any cached data by querying the filesystem about the file identified by <code>m_path</code>. If an error occurs,
2732+
the value of the cached data is unspecified.</p>
26842733
</blockquote>
26852734
<h3> <a name="directory_entry-observers"> <code>directory_entry</code> observers</a>
26862735
[directory_entry.obs]</h3>
@@ -2691,35 +2740,55 @@ <h3> <a name="directory_entry-observers"> <code>directory_entry</code> observers
26912740
<pre>file_status status() const;
26922741
file_status status(system::error_code&amp; ec) const;</pre>
26932742
<blockquote>
2694-
<p><i>Effects:</i> As if,</p>
2695-
<blockquote>
2696-
<pre>if ( !status_known( m_status ) )
2697-
{
2698-
if ( status_known(m_symlink_status) &amp;&amp; !is_symlink(m_symlink_status) )
2699-
{ m_status = m_symlink_status; }
2700-
else { m_status = status(m_path<i>[, ec]</i>); }
2701-
}</pre>
2702-
</blockquote>
2743+
<p><i>Effects:</i> If <code>!status_known(m_status)</code>, calls <code>refresh()</code> or <code>refresh(ec)</code>, respectively.</p>
27032744
<p><i>Returns:</i> <code>m_status</code></p>
2704-
27052745
<p><i>Throws:</i> As specified in <a href="#Error-reporting">Error reporting</a>.</p>
2706-
27072746
</blockquote>
27082747
<pre>file_status symlink_status() const;
27092748
file_status symlink_status(system::error_code&amp; ec) const;</pre>
27102749
<blockquote>
2711-
<p>
2712-
<i>Effects:</i> As if,</p>
2713-
<blockquote>
2714-
<pre>if ( !status_known( m_symlink_status ) )
2715-
{
2716-
m_symlink_status = symlink_status(m_path<i>[, ec]</i>);
2717-
}</pre>
2718-
</blockquote>
2750+
<p><i>Effects:</i> If <code>!status_known(m_symlink_status)</code>, calls <code>refresh()</code> or <code>refresh(ec)</code>, respectively.</p>
27192751
<p><i>Returns:</i> <code>m_symlink_status</code></p>
2720-
27212752
<p><i>Throws:</i> As specified in <a href="#Error-reporting">Error reporting</a>.</p>
2722-
2753+
</blockquote>
2754+
<pre>file_type file_type() const;
2755+
file_type file_type(system::error_code&amp; ec) const;</pre>
2756+
<blockquote>
2757+
<p><i>Effects:</i> Equivalent to <code>status().type()</code> or <code>status(ec).type()</code>, respectively.</p>
2758+
<p>[<i>Note:</i> The implementation may be more efficient than calling <code>status</code>, if the information
2759+
about the file type is cached, but permissions are not. <i>—end note</i>]</p>
2760+
</blockquote>
2761+
<pre>file_type symlink_file_type() const;
2762+
file_type symlink_file_type(system::error_code&amp; ec) const;</pre>
2763+
<blockquote>
2764+
<p><i>Effects:</i> Equivalent to <code>symlink_status().type()</code> or <code>symlink_status(ec).type()</code>, respectively.</p>
2765+
<p>[<i>Note:</i> The implementation may be more efficient than calling <code>symlink_status</code>, if the information
2766+
about the file type is cached, but permissions are not. <i>—end note</i>]</p>
2767+
</blockquote>
2768+
<pre>bool exists() const;
2769+
bool exists(system::error_code&amp; ec) const;</pre>
2770+
<blockquote>
2771+
<p><i>Effects:</i> Equivalent to <code>exists(status())</code> or <code>exists(status(ec))</code>, respectively.</p>
2772+
</blockquote>
2773+
<pre>bool is_regular_file() const;
2774+
bool is_regular_file(system::error_code&amp; ec) const;</pre>
2775+
<blockquote>
2776+
<p><i>Effects:</i> Equivalent to <code>is_regular_file(status())</code> or <code>is_regular_file(status(ec))</code>, respectively.</p>
2777+
</blockquote>
2778+
<pre>bool is_directory() const;
2779+
bool is_directory(system::error_code&amp; ec) const;</pre>
2780+
<blockquote>
2781+
<p><i>Effects:</i> Equivalent to <code>is_directory(status())</code> or <code>is_directory(status(ec))</code>, respectively.</p>
2782+
</blockquote>
2783+
<pre>bool is_symlink() const;
2784+
bool is_symlink(system::error_code&amp; ec) const;</pre>
2785+
<blockquote>
2786+
<p><i>Effects:</i> Equivalent to <code>is_symlink(symlink_status())</code> or <code>is_symlink(symlink_status(ec))</code>, respectively.</p>
2787+
</blockquote>
2788+
<pre>bool is_other() const;
2789+
bool is_other(system::error_code&amp; ec) const;</pre>
2790+
<blockquote>
2791+
<p><i>Effects:</i> Equivalent to <code>is_other(status())</code> or <code>is_other(status(ec))</code>, respectively.</p>
27232792
</blockquote>
27242793
<pre>bool operator==(const directory_entry&amp; rhs);</pre>
27252794
<blockquote>

doc/release_history.html

+5
@@ -41,6 +41,11 @@
4141

4242
<h2>1.83.0</h2>
4343
<ul>
44+
<li>Added <code>directory_entry::refresh</code> method that updates internal cached file statuses for the directory entry identified by path.</li>
45+
<li><b>v4:</b> <code>directory_entry</code> constructors and modifiers that initialize or modify the path now automatically call <code>refresh</code>. This may result in errors that were not indicated before and in <b>v3</b>, if querying the filesystem for file statuses fails (e.g. if the file does not exist). This new behavior is similar to std::filesystem.</li>
46+
<li><b>v4:</b> <code>directory_entry</code> constructors and methods taking <code>file_status</code> parameters are removed. Users are recommended to remove these arguments and rely on <code>directory_entry</code> calling <code>refresh</code> internally.</li>
47+
<li>Added <code>directory_entry</code> member methods for checking the file type of the file, similar to std::filesystem.</li>
48+
<li><code>recursive_directory_iterator</code> is now more likely to reuse information about the file type that is obtained during filesystem iteration. This may improve performance. (<a href="https://github.com/boostorg/filesystem/issues/288">#288</a>)</li>
4449
<li>File streams defined in <code>boost/filesystem/fstream.hpp</code> are now movable, if the standard library file streams are. (<a href="https://github.com/boostorg/filesystem/issues/280">#280</a>)</li>
4550
<li>Generic <code>path</code> comparison operators are now more restricted to avoid potential ambiguities when user's code contains a <code>using namespace boost::filesystem;</code> directive. (<a href="https://github.com/boostorg/filesystem/issues/285">#285</a>)</li>
4651
<li>Fixed potential overload resolution ambiguity in users' code, where <code>path</code> constructors from iterators could interfere with function overloads taking a <code>std::initializer_list</code> argument. (<a href="https://github.com/boostorg/filesystem/issues/287">#287</a>)</li>

doc/v4.html

+4 -1
@@ -54,10 +54,13 @@ <h2>Breaking changes</h2>
5454
<li><a href="reference.html#path-appends"><code>path</code> appends</a> consider root name and root directory of the appended path. If the appended path is absolute, or root name is present and differs from the source path, the resulting path is equivalent to the appended path. If root directory is present, the result is the root directory and relative path rebased on top of the root name of the source path. Otherwise, the behavior is similar to v3. This behavior is similar to C++17 std::filesystem.</li>
5555
<li><code>path</code> no longer supports construction, assignment or appending from containers of characters. Use string types or iterators as the source for these opereations instead.</li>
5656
<li><a href="reference.html#path-remove_filename"><code>path::remove_filename</code></a> preserves the trailing directory separator, so that <code>path::has_filename</code> returns <code>false</code> after a successful call to <code>path::remove_filename</code>.</li>
57+
<li><code>directory_entry</code> constructors and modifiers that initialize or modify path of the directory entry automatically call <code>directory_entry::refresh</code> instead of clearing cached file statuses. This means that the file identified by the new path needs to be accessible in the filesystem at the point of the call.</li>
58+
<li><code>directory_entry</code> constructors and modifiers that accept <code>file_status</code> arguments to initialize cached file statuses are removed. The amount of cached data
59+
is an implementation detail of <code>directory_entry</code>, and in the future it may not be limited to just file statuses. Users should rely on automatic refreshes of the cached data.</li>
5760
</ul>
5861

5962
<hr>
60-
<p>&copy; Copyright Andrey Semashev, 2021-2022</p>
63+
<p>&copy; Copyright Andrey Semashev, 2021-2023</p>
6164
<p>Distributed under the Boost Software License, Version 1.0. See
6265
<a href="http://www.boost.org/LICENSE_1_0.txt">www.boost.org/LICENSE_1_0.txt</a></p>
6366
