Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition when running two elixir scripts in the same unix pipeline #13326

Closed
kidq330 opened this issue Feb 9, 2024 · 8 comments
Closed

Comments

@kidq330
Copy link

kidq330 commented Feb 9, 2024

Elixir and Erlang/OTP versions

root@13f3f753ab50:/workspace/membrane/mix_cache_mre:λ> elixir -v
Erlang/OTP 26 [erts-14.2.1] [source] [64-bit] [smp:5:5] [ds:5:5:10] [async-threads:1] [jit]

Elixir 1.15.7 (compiled with Erlang/OTP 24)

Operating system

Debian 12 container

Current behavior

Scenario: I have two separate .exs scripts:

concat1.exs:

Mix.install([:bunch], force: true, verbose: true, system_env: %{ID: <<1>>})
(IO.read(:eof) <> "\nAdding a string\n") |> IO.puts()

concat2.exs:

Mix.install([:bunch], force: true, verbose: true, system_env: %{ID: <<2>>})
(IO.read(:eof) <> "\nAdding some other string\n") |> IO.puts()

I would like to run both inside the same command, like so:

root@13f3f753ab50:/workspace/membrane/mix_cache_mre:λ> echo hello | elixir concat1.exs | elixir concat2.exs 
Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/74a5dcec56f0af74a6e975a852595e45
Resolving Hex dependencies...
Resolution completed in 0.007s
New:
  bunch 1.6.1
* Getting bunch (Hex package)
==> bunch
Compiling 19 files (.ex)
Generated bunch app
Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/1e09fc90d768c108cbadc8d6e2258f56
Resolving Hex dependencies...
Resolution completed in 0.007s
New:
  bunch 1.6.1
* Getting bunch (Hex package)
==> bunch
Compiling 19 files (.ex)
Generated bunch app
hello

Adding a string


Adding some other string

This works because they're using different caches, as you can see by the Mix.install/2 using ... lines.
Unfortunately the process of generating a hash to use for the cache is imo not strict enough, if I delete the system_env: option from both scripts (or just set it to the same value) they will generate the same hash, and might fight over the cache. Example below:

concat1.exs:

Mix.install([:bunch], force: true, verbose: true)
(IO.read(:eof) <> "\nAdding a string\n") |> IO.puts()

concat2.exs:

Mix.install([:bunch], force: true, verbose: true)
(IO.read(:eof) <> "\nAdding some other string\n") |> IO.puts()
root@13f3f753ab50:/workspace/membrane/mix_cache_mre:λ> echo hello | elixir concat1.exs | elixir concat2.exs 
Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df
** (File.Error) could not get current working directory nil: no such file or directory
    (elixir 1.15.7) lib/file.ex:1567: File.cwd!/0
    (elixir 1.15.7) lib/path.ex:47: Path.absname/1
    (mix 1.15.7) lib/mix/dep/loader.ex:361: Mix.Dep.Loader.mix_children/3
    (mix 1.15.7) lib/mix/dep/loader.ex:18: Mix.Dep.Loader.children/1
    (mix 1.15.7) lib/mix/dep/converger.ex:97: Mix.Dep.Converger.all/4
    (mix 1.15.7) lib/mix/dep/converger.ex:89: Mix.Dep.Converger.converge/4
    (mix 1.15.7) lib/mix/dep/fetcher.ex:16: Mix.Dep.Fetcher.all/3
    (mix 1.15.7) lib/mix/tasks/deps.get.ex:34: Mix.Tasks.Deps.Get.run/1
Failed to write log message to stdout, trying stderr

01:02:56.897 [info] Failed to write to standard out (:epipe)
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: broken pipe (epipe)

    (stdlib 5.2) io.erl:103: :io.put_chars(:standard_io, [[[], "Resolving Hex dependencies..."], 10])
    (hex 2.0.6) lib/hex/remote_converger.ex:54: Hex.RemoteConverger.converge/2
    (mix 1.15.7) lib/mix/dep/converger.ex:133: Mix.Dep.Converger.all/4
    (mix 1.15.7) lib/mix/dep/converger.ex:89: Mix.Dep.Converger.converge/4
    (mix 1.15.7) lib/mix/dep/fetcher.ex:16: Mix.Dep.Fetcher.all/3
    (mix 1.15.7) lib/mix/tasks/deps.get.ex:34: Mix.Tasks.Deps.Get.run/1
    (mix 1.15.7) lib/mix/task.ex:455: anonymous fn/3 in Mix.Task.run_task/5
    (mix 1.15.7) lib/mix.ex:881: anonymous fn/5 in Mix.install/2

While the problem is less prevalent without force: true, it may still happen if no cache yet exists:

root@13f3f753ab50:/workspace/membrane/mix_cache_mre:λ> while /bin/rm -rf /root/.cache/mix/installs/ && sleep 5 && echo hello | elixir concat1.exs
 | elixir concat2.exs ; do :; done
Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df
Resolving Hex dependencies...
Resolution completed in 0.011s
New:
  bunch 1.6.1
* Getting bunch (Hex package)
==> bunch
Compiling 19 files (.ex)
Generated bunch app
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "bunch", "1.6.1"}}, 120000)
    ** (EXIT) time out
    (elixir 1.15.7) lib/gen_server.ex:1074: GenServer.call/3
    (hex 2.0.6) lib/hex/scm.ex:128: Hex.SCM.update/1
    (mix 1.15.7) lib/mix/dep/fetcher.ex:61: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.15.7) lib/mix/dep/converger.ex:229: Mix.Dep.Converger.all/8
    (mix 1.15.7) lib/mix/dep/converger.ex:162: Mix.Dep.Converger.init_all/8
    (mix 1.15.7) lib/mix/dep/converger.ex:146: Mix.Dep.Converger.all/4
    (mix 1.15.7) lib/mix/dep/converger.ex:89: Mix.Dep.Converger.converge/4
    (mix 1.15.7) lib/mix/dep/fetcher.ex:16: Mix.Dep.Fetcher.all/3
Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df
Resolving Hex dependencies...
Resolution completed in 0.01s
New:
  bunch 1.6.1
* Updating bunch (Hex package)

Adding some other string

Mix.install/2 using /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df
Resolving Hex dependencies...
Resolution completed in 0.01s
New:
  bunch 1.6.1
* Getting bunch (Hex package)
==> bunch
Compiling 19 files (.ex)
Generated bunch app
warning: redefining module Bunch.Struct (current version loaded from /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/bunch/ebin/Elixir.Bunch.Struct.beam)
  lib/bunch/struct.ex:1: Bunch.Struct (module)

warning: redefining module Bunch.ShortRef (current version loaded from /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/bunch/ebin/Elixir.Bunch.ShortRef.beam)
  lib/bunch/short_ref.ex:1: Bunch.ShortRef (module)

warning: redefining module Bunch.Typespec (current version loaded from /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/bunch/ebin/Elixir.Bunch.Typespec.beam)
  lib/bunch/typespec.ex:1: Bunch.Typespec (module)

warning: redefining module Bunch.Type (current version loaded from /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/bunch/ebin/Elixir.Bunch.Type.beam)
  lib/bunch/type.ex:1: Bunch.Type (module)

warning: redefining module Inspect.Bunch.ShortRef (current version loaded from /root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/bunch/ebin/Elixir.Inspect.Bunch.ShortRef.beam)
  lib/bunch/short_ref.ex:49: Inspect.Bunch.ShortRef (module)

** (EXIT from #PID<0.99.0>) an exception was raised:
    ** (File.Error) could not write to file "/root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/mix_install/consolidated/Elixir.Collectable.beam": no such file or directory
        (elixir 1.15.7) lib/file.ex:1117: File.write!/3
        (mix 1.15.7) lib/mix/tasks/compile.protocols.ex:142: Mix.Tasks.Compile.Protocols.consolidate/4
        (elixir 1.15.7) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
        (elixir 1.15.7) lib/task/supervised.ex:36: Task.Supervised.reply/4


02:06:40.177 [error] Task #PID<0.316.0> started from #PID<0.99.0> terminating
** (File.Error) could not write to file "/root/.cache/mix/installs/elixir-1.15.7-erts-14.2.1/f56c8600b5d135b8a055d35801b604df/_build/dev/lib/mix_install/consolidated/Elixir.Collectable.beam": no such file or directory
    (elixir 1.15.7) lib/file.ex:1117: File.write!/3
    (mix 1.15.7) lib/mix/tasks/compile.protocols.ex:142: Mix.Tasks.Compile.Protocols.consolidate/4
    (elixir 1.15.7) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.15.7) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<9.61781368/0 in Mix.Tasks.Compile.Protocols.consolidate/6>
    Args: []
Failed to write log message to stdout, trying stderr

02:06:40.272 [info] Failed to write to standard out (:epipe)
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: broken pipe (epipe)

    (stdlib 5.2) io.erl:103: :io.put_chars(:standard_io, ["hello\n\nAdding a string\n", 10])
    concat1.exs:2: (file)

Expected behavior

The pipeline works as in the case where the scripts have unique system_env's, (that is, separate cache directories).

@josevalim
Copy link
Member

Good find. We need file system locking to solve this, which is not yet available in Erlang/OTP. So we will be able to solve this only in the future.

@lukaszsamson
Copy link
Contributor

I run into a similar issue in ElixirLS install script. In my case the race happens when multiple editor windows are started, each one running its own LSP process. I haven't solved that yet.

As a side note piping into Mix.install doesn't work very well. Mix is quite chatty and pollutes stdout with app banners and install logs. I also noticed it sometimes consumes stdin.

@lukaszsamson
Copy link
Contributor

In your case @kidp330 you can workaround the issue with manual flock or lockfile in your shell script.

kidq330 added a commit to membraneframework/membrane_file_plugin that referenced this issue Feb 14, 2024
kidq330 added a commit to membraneframework/membrane_file_plugin that referenced this issue Feb 15, 2024
* Bump to 0.17.0

* Workaround for elixir-lang/elixir#13326
@shahryarjb
Copy link
Contributor

shahryarjb commented Apr 3, 2024

Hello, sometimes in developer mode I get this error in my iex and I can not debug this. it is returned automatically, and breaks my iex session

I am using Erlang/OTP 26 and 1.16.2 , mnesia, :phoenix, "~> 1.7.11"
Intel macOs and asdf installed. it is a simple elixir project not an script

these are same error? @josevalim

iex(44)> َFailed to write log message to stdout, trying stderr
[error] GenServer #PID<0.64.0> terminating
** (MatchError) no match of right hand side value: :error
    (kernel 9.2.2) prim_tty.erl:683: :prim_tty.handle_request/2
    (kernel 9.2.2) user_drv.erl:807: :user_drv.io_request/2
    (kernel 9.2.2) user_drv.erl:828: :user_drv.io_requests/2
    (kernel 9.2.2) user_drv.erl:775: :user_drv.io_request/2
    (kernel 9.2.2) user_drv.erl:891: :user_drv.handle_req/3
    (kernel 9.2.2) user_drv.erl:484: :user_drv.server/3
    (stdlib 5.2.1) gen_statem.erl:1395: :gen_statem.loop_state_callback/11
    (stdlib 5.2.1) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.69.0>, {{:badmatch, :error}, [{:prim_tty, :handle_request, 2, [file: ~c"prim_tty.erl", line: 683]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 807]}, {:user_drv, :io_requests, 2, [file: ~c"user_drv.erl", line: 828]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 775]}, {:user_drv, :handle_req, 3, [file: ~c"user_drv.erl", line: 891]}, {:user_drv, :server, 3, [file: ~c"user_drv.erl", line: 484]}, {:gen_statem, :loop_state_callback, 11, [file: ~c"gen_statem.erl", line: 1395]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}}
State: {:state, :user_sup, :undefined, #PID<0.69.0>, {#PID<0.64.0>, :user_sup}}

@josevalim
Copy link
Member

It seems to be something different. What is your OS and the first three lines of iex?

@shahryarjb
Copy link
Contributor

shahryarjb commented Apr 3, 2024

It seems to be something different. What is your OS and the first three lines of iex?

@josevalim I am very sorry, I cleaned my terminal, and I did not get this error again:

My os: Intel macOs V14.3.1

Erlang/OTP 26 [erts-14.2.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]

Elixir 1.16.2 (compiled with Erlang/OTP 26)

but I remember
First I created 2500 users in mnesia

this code:

def create_user() do
    data = fn ->
      rand = randstring(10)

      %{
        "full_name" => "user #{generate(20) |> String.downcase()}",
        "username" => generate(20),
        "email" => "shahryar_#{rand}@gmail.com",
        "scopes" => ["open_id", "email", "mobile"],
        "devices" => ["ios", "mac"],
        "status" => 4,
        "roles" => [
          %{"name" => "admin", "permissions" => ["admin:read", "blog:edit", "photo:read"]},
          %{"name" => "user", "permissions" => ["blog:read", "photo:creat"]}
        ]
      }
    end

    {time, _value} =
      :timer.tc(fn ->
        Enum.map(1..2500, fn x ->
          case Users.create(data.()) do
            {:ok, _struct} -> IO.inspect("User #{x} is created!!")
            error -> IO.inspect(error, label: "User #{x} is not created!!")
          end
        end)
      end)

    time
  end

after That I use this 2 or 3 times

:mnesia.ets(fn -> :mnesia.all_keys(__MODULE__) end) |> Enum.max()
:mnesia.ets(fn -> :mnesia.all_keys(__MODULE__) end) |> Enum.min()
:mnesia.ets(fn -> :mnesia.all_keys(__MODULE__) end) |> List.first()
:mnesia.ets(fn -> :mnesia.all_keys(__MODULE__) end) |> List.last()

After 1 min it returned!! all process I did, returned success outputs.

Thank you in advance

@shahryarjb
Copy link
Contributor

shahryarjb commented Jun 10, 2024

@josevalim Finally I can figure out why it creates this error.

When I write Persian in terminal and push the shift and a character after that delete it with backspace it return this error.

Please see this Video

111.mp4

Update in Elixir 1.17.0 (compiled with Erlang/OTP 27)

iex(1)> ٌFailed to write log message to stdout, trying stderr

09:47:43.528 [error] :gen_statem :user_drv terminating
** (MatchError) no match of right hand side value: :error
    (kernel 10.0) prim_tty.erl:772: :prim_tty.handle_request/2
    (kernel 10.0) user_drv.erl:837: :user_drv.io_request/2
    (kernel 10.0) user_drv.erl:858: :user_drv.io_requests/2
    (kernel 10.0) user_drv.erl:805: :user_drv.io_request/2
    (kernel 10.0) user_drv.erl:921: :user_drv.handle_req/3
    (kernel 10.0) user_drv.erl:488: :user_drv.server/3
    (stdlib 6.0) gen_statem.erl:3113: :gen_statem.loop_state_callback/11
    (stdlib 6.0) proc_lib.erl:329: :proc_lib.init_p_do_apply/3
Queue: [info: {#PID<0.70.0>, {:requests, [{:delete_chars, -1}, {:redraw_prompt, ~c"iex(1)> ", ~c"     .. ", {[], {[], []}, []}}]}}]
Postponed: []
State: :server
Data: {:state, {:state, #Reference<0.3742164222.3842637830.70993>, {#PID<0.68.0>, #Reference<0.3742164222.3842506758.71002>}, {#PID<0.67.0>, #Reference<0.3742164222.3842506758.71000>}, %{input: true, echo: false, tty: true, canon: false}, true, [], [], [[32, 1612], 62, 41, 49, 40, 120, 101, 105], [], :undefined, 1, 0, 167, 26, true, "\e[H\e[2J", "\e[A", "\n", "\b", "\e[C", "\t", "\e[J", "\e[%p1%d@", "\e[%p1%dP", "\e[6n", "\e\\[([0-9]+);([0-9]+)R", {:re_pattern, 0, 1, 0, <<69, 82, 67, 80, 94, 1, 0, 0, 16, 8, 0, 0, 1, 0, 0, ...>>}}, #Reference<0.3742164222.3842506758.71000>, #Reference<0.3742164222.3842506758.71002>, :new, :undefined, #PID<0.69.0>, #PID<0.70.0>, {:gr, 2, 1, #PID<0.70.0>, [{:group, 0, #PID<0.69.0>, {}}, {:group, 1, #PID<0.70.0>, {:iex, :start, [[dot_iex_path: nil, on_eof: :halt], {:elixir, :start_cli, []}]}}]}, {false, {[], []}}}
Callback mode: :state_functions, state_enter: false
Failed to write log message to stdout, trying stderr

09:47:43.535 [error] GenServer #PID<0.64.0> terminating
** (MatchError) no match of right hand side value: :error
    (kernel 10.0) prim_tty.erl:772: :prim_tty.handle_request/2
    (kernel 10.0) user_drv.erl:837: :user_drv.io_request/2
    (kernel 10.0) user_drv.erl:858: :user_drv.io_requests/2
    (kernel 10.0) user_drv.erl:805: :user_drv.io_request/2
    (kernel 10.0) user_drv.erl:921: :user_drv.handle_req/3
    (kernel 10.0) user_drv.erl:488: :user_drv.server/3
    (stdlib 6.0) gen_statem.erl:3113: :gen_statem.loop_state_callback/11
    (stdlib 6.0) proc_lib.erl:329: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.69.0>, {{:badmatch, :error}, [{:prim_tty, :handle_request, 2, [file: ~c"prim_tty.erl", line: 772]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 837]}, {:user_drv, :io_requests, 2, [file: ~c"user_drv.erl", line: 858]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 805]}, {:user_drv, :handle_req, 3, [file: ~c"user_drv.erl", line: 921]}, {:user_drv, :server, 3, [file: ~c"user_drv.erl", line: 488]}, {:gen_statem, :loop_state_callback, 11, [file: ~c"gen_statem.erl", line: 3113]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 329]}]}}
State: {:state, :user_sup, :undefined, #PID<0.69.0>, {#PID<0.64.0>, :user_sup}}

@josevalim
Copy link
Member

This has been solved in v1.18 as it introduces a OS lock. You can see it here:

$ echo hello | elixir concat1.exs | elixir concat2.exs
Mix.install/2 using /Users/jose/Library/Caches/mix/installs/elixir-1.18.0-dev-erts-15.0/f56c8600b5d135b8a055d35801b604df
Resolving Hex dependencies...
Resolution completed in 0.009s
New:
  bunch 1.6.1
* Getting bunch (Hex package)
==> bunch
Compiling 19 files (.ex)
Generated bunch app
Mix.install/2 using /Users/jose/Library/Caches/mix/installs/elixir-1.18.0-dev-erts-15.0/f56c8600b5d135b8a055d35801b604df
Waiting for lock on the deps directory (held by process 40346)
Resolving Hex dependencies...
Resolution completed in 0.008s
Unchanged:
  bunch 1.6.1
All dependencies are up to date
hello

Adding a string


Adding some other string

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants