Added cover

master
Yigit Colakoglu 4 years ago
parent
commit
3f4154005a
23 changed files with 3086 additions and 36 deletions
  1. +14
    -13
      content/posts/supercharge-your-bash-scripts-with-multiprocessing.md
  2. +20
    -1
      public/about/index.html
  3. +20
    -1
      public/awards/index.html
  4. +42
    -1
      public/index.html
  5. +333
    -1
      public/index.xml
  6. +10
    -6
      public/posts/index.html
  7. +328
    -7
      public/posts/index.xml
  8. +542
    -0
      public/posts/supercharge-your-bash-scripts-with-multiprocessing/index.html
  9. +20
    -1
      public/projects/index.html
  10. +21
    -4
      public/sitemap.xml
  11. +216
    -0
      public/tags/bash/index.html
  12. +343
    -0
      public/tags/bash/index.xml
  13. +1
    -0
      public/tags/bash/page/1/index.html
  14. +24
    -0
      public/tags/index.html
  15. +32
    -1
      public/tags/index.xml
  16. +216
    -0
      public/tags/programming/index.html
  17. +343
    -0
      public/tags/programming/index.xml
  18. +1
    -0
      public/tags/programming/page/1/index.html
  19. +216
    -0
      public/tags/scripting/index.html
  20. +343
    -0
      public/tags/scripting/index.xml
  21. +1
    -0
      public/tags/scripting/page/1/index.html
  22. BIN
      static/images/glasses.png
  23. BIN
      static/images/supercharge-your-bash-scripts-with-multiprocessing.png

+ 14
- 13
content/posts/supercharge-your-bash-scripts-with-multiprocessing.md

@ -3,7 +3,7 @@ title = "Supercharge Your Bash Scripts with Multiprocessing"
date = "2021-05-05T17:08:12+03:00"
author = "Yigit Colakoglu"
authorTwitter = "theFr1nge"
cover = ""
cover = "images/supercharge-your-bash-scripts-with-multiprocessing.png"
tags = ["bash", "scripting", "programming"]
keywords = ["bash", "scripting"]
description = "Bash is a great tool for automating tasks and improving your workflow. However, it is ***SLOW***. Adding multiprocessing to the scripts you write can improve the performance greatly."
@ -35,7 +35,7 @@ process you ran the command on, if you change a variable that the command in the
background uses while it runs, it will not be affected. Here is a simple
example:
```bash
{{< code language="bash" id="1" expand="Show" collapse="Hide" isCollapsed="false" >}}
foo="yeet"
function run_in_background(){
@ -47,7 +47,7 @@ run_in_background & # Spawn the function run_in_background in the background
foo="YEET"
echo "The value of foo changed to $foo."
wait # wait for the background process to finish
```
{{< /code >}}
This should output:
@ -67,14 +67,14 @@ efficient route first before moving on to the big boy implementation. Let's open
First of all, let's write a very simple function that allows us to easily test
our implementation:
```bash
{{< code language="bash" id="1" expand="Show" collapse="Hide" isCollapsed="false" >}}
function tester(){
# A function that takes an int as a parameter and sleeps
echo "$1"
sleep "$1"
echo "ENDED $1"
}
```
{{< /code >}}
Now that we have something to run in our processes, we now need to spawn several
of them in a controlled manner. Controlled being the keyword here. That's because
@ -82,7 +82,7 @@ each system has a maximum number of processes that can be spawned (You can find
that out with the command `ulimit -u`). In our case, we want to limit the
number of running processes to the value of `num_processes`. Here is the implementation:
```bash
{{< code language="bash" id="1" expand="Show" collapse="Hide" isCollapsed="false" >}}
num_processes=$1
pcount=0
for i in {1..10}; do
@ -90,7 +90,7 @@ for i in {1..10}; do
((pcount++==0)) && wait
tester $i &
done
```
{{< /code >}}
What this loop does is that it takes the number of processes you would like to
spawn as an argument and runs `tester` in that many processes. Go ahead and test it out!
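
For reference, here is a minimal, self-contained sketch of the batching loop. The default of 3 processes, the shortened fractional sleeps and the trailing `wait` are assumptions added to keep the demo quick and tidy; they are not part of the original script.

```bash
#!/usr/bin/env bash
# Sketch of the batching approach. The default of 3 and the short sleeps
# are assumptions for this demo.
num_processes="${1:-3}"

tester() {
    echo "$1"
    sleep "0.$1"    # fractional sleep (supported by GNU sleep) keeps the demo short
    echo "ENDED $1"
}

pcount=0
for i in {1..6}; do
    # pcount cycles through 0..num_processes-1; whenever it wraps to 0
    # we wait for the previous batch before starting a new one
    ((pcount = pcount % num_processes))
    ((pcount++ == 0)) && wait
    tester "$i" &
done
wait   # collect the final batch before the script exits
```

Running it with `bash batch.sh 2` (the file name is hypothetical) should make the pauses between batches easy to see.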
@ -113,7 +113,8 @@ continuously pick up jobs from the job pool not waiting for any other process to
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.
```bash
{{< code language="bash" id="1" expand="Show" collapse="Hide" isCollapsed="false" >}}
job_pool_end_of_jobs="NO_JOB_LEFT"
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
@ -203,7 +204,7 @@ function job_pool_wait()
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
```
{{< /code >}}
Ok... But what the actual fuck is going on in here???
@ -219,7 +220,6 @@ their purposes, shall we?
fifo's man page tells us that:
```
NAME
fifo - first-in first-out special file, named pipe
@ -233,7 +233,7 @@ DESCRIPTION
that processes can access the pipe using a name in the filesystem.
```
So put in **very** simple terms, a fifo is a named pipe that can allows
So put in **very** simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read them
with `read cmd args < ${job_queue}`, the job is out of the pipe and the next
@ -285,7 +285,8 @@ inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
below code under our "chad" job pool script.
```bash
{{< code language="bash" id="1" expand="Show" collapse="Hide" isCollapsed="false" >}}
function tester(){
# A function that takes an int as a parameter and sleeps
echo "$1"
@ -302,7 +303,7 @@ done
job_pool_wait
job_pool_shutdown
```
{{< /code >}}
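
If the full job pool feels like too much at once, the core idea (several workers reading one fifo, with `flock` making sure only one of them reads at a time) can be tried in isolation. The following is only a sketch under assumptions: the `/tmp` paths, the `STOP` marker, the two workers and the four single-letter jobs are all made up for the demo, and it relies on Linux behaviour (opening a fifo read-write does not block) and on the `flock` utility being installed.

```bash
#!/usr/bin/env bash
# Sketch: two workers share one fifo; flock on fd 7 serializes their reads.
# Paths, the STOP marker and the job names are assumptions for this demo.
queue="/tmp/fifo_demo_queue_$$"
results="/tmp/fifo_demo_results_$$"
rm -f "$queue" "$results"
mkfifo "$queue"
: > "$results"

worker() {
    local id=$1 line
    exec 7<>"$queue"              # each worker opens its own descriptor to the fifo
    while true; do
        flock -x 7                # take the exclusive lock
        IFS= read -r line <&7     # consume exactly one job line
        flock -u 7                # release the lock for the other worker
        if [[ "$line" == "STOP" ]]; then
            echo "STOP" >&7       # pass the marker on so the next worker stops too
            break
        fi
        echo "worker $id: $line" >> "$results"
    done
    exec 7>&-
}

worker 1 &
worker 2 &
for job in a b c d; do
    echo "$job" >> "$queue"
done
echo "STOP" >> "$queue"
wait
sort -k3 "$results"    # which worker handled which job varies between runs
rm -f "$queue"
```

Every job should show up exactly once in the results file, no matter which worker picked it up. Remove `$results` yourself once you are done looking at it.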
Hopefully this article was (or will be) helpful to you. From now on, you don't
ever have to write single-threaded bash scripts like normies :)

+ 20
- 1
public/about/index.html

@ -176,7 +176,26 @@ hit me up through social media, I am open to chat :)</p>
<div id="disqus_thread"></div>
<script type="application/javascript">
var disqus_config = function () {
};
(function() {
if (["localhost", "127.0.0.1"].indexOf(window.location.hostname) != -1) {
document.getElementById('disqus_thread').innerHTML = 'Disqus comments not available by default when the website is previewed locally.';
return;
}
var d = document, s = d.createElement('script'); s.async = true;
s.src = '//' + "fr1nge-xyz" + '.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</div>


+ 20
- 1
public/awards/index.html

@ -165,7 +165,26 @@
<div id="disqus_thread"></div>
<script type="application/javascript">
var disqus_config = function () {
};
(function() {
if (["localhost", "127.0.0.1"].indexOf(window.location.hostname) != -1) {
document.getElementById('disqus_thread').innerHTML = 'Disqus comments not available by default when the website is previewed locally.';
return;
}
var d = document, s = d.createElement('script'); s.async = true;
s.src = '//' + "fr1nge-xyz" + '.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</div>


+ 42
- 1
public/index.html

@ -1,7 +1,7 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta name="generator" content="Hugo 0.82.0" />
<meta name="generator" content="Hugo 0.83.1" />
<title>Fr1nge&#39;s Personal Blog</title>
@ -146,6 +146,47 @@
<div class="post on-list">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a>
</h1>
<div class="post-meta">
<span class="post-date">
2021-05-05
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
<div class="post-content">
Bash is a great tool for automating tasks and improving your workflow. However, it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve the performance greatly.
</div>
<div>
<a class="read-more button"
href="/posts/supercharge-your-bash-scripts-with-multiprocessing/">Read more →</a>
</div>
</div>
<div class="pagination">
<div class="pagination__buttons">


+ 333
- 1
public/index.xml

@ -6,7 +6,339 @@
<description>Recent content on Fr1nge&#39;s Personal Blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright><atom:link href="http://fr1nge.xyz/index.xml" rel="self" type="application/rss+xml" />
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Supercharge Your Bash Scripts with Multiprocessing</title>
<link>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</guid>
<description>Bash is a great tool for automating tasks and improving your workflow. However, it is SLOW. Adding multiprocessing to the scripts you write can improve the performance greatly.
What is multiprocessing? In the simplest terms, multiprocessing is the principle of splitting the computations or jobs that a script has to do and running them on different processes. In even simpler terms however, multiprocessing is the computer science equivalent of hiring more than one worker when you are constructing a building.</description>
<content>&lt;p&gt;Bash is a great tool for automating tasks and improving your workflow. However,
it is &lt;em&gt;&lt;strong&gt;SLOW&lt;/strong&gt;&lt;/em&gt;. Adding multiprocessing to the scripts you write can improve
the performance greatly.&lt;/p&gt;
&lt;h2 id=&#34;what-is-multiprocessing&#34;&gt;What is multiprocessing?&lt;/h2&gt;
&lt;p&gt;In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.&lt;/p&gt;
&lt;h3 id=&#34;introducing-&#34;&gt;Introducing &amp;ldquo;&amp;amp;&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;While implementing multiprocessing, the &lt;code&gt;&amp;amp;&lt;/code&gt; operator is going to be our greatest
friend. It is essential if you are writing bash scripts and a very
useful tool in general when you are in the terminal. Appending &lt;code&gt;&amp;amp;&lt;/code&gt; to a
command runs that command in the background and lets the rest of the script
continue while it runs. One thing to keep in mind is that the background
command runs in a fork of the current shell, so it gets its own copy of every
variable: if you change a variable in the main script after starting the
command, the background command still sees the old value. Here is a simple
example:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
foo=&amp;#34;yeet&amp;#34;
function run_in_background(){
sleep 0.5
echo &amp;#34;The value of foo in the function run_in_background is $foo&amp;#34;
}
run_in_background &amp;amp; # Spawn the function run_in_background in the background
foo=&amp;#34;YEET&amp;#34;
echo &amp;#34;The value of foo changed to $foo.&amp;#34;
wait # wait for the background process to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This should output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, the value of &lt;code&gt;foo&lt;/code&gt; did not change in the background process even though
we changed it in the main script.&lt;/p&gt;
&lt;h2 id=&#34;baby-steps&#34;&gt;Baby steps&amp;hellip;&lt;/h2&gt;
&lt;p&gt;Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&amp;rsquo;s open up vim and get to scripting!
First of all, let&amp;rsquo;s write a very simple function that allows us to easily test
our implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled being the keyword here. That&amp;rsquo;s because
each system has a maximum number of processes that can be spawned (you can find
that out with the command &lt;code&gt;ulimit -u&lt;/code&gt;). In our case, we want to limit the
number of running processes to the value of &lt;code&gt;num_processes&lt;/code&gt;. Here is the implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&amp;#43;&amp;#43;==0)) &amp;amp;&amp;amp; wait
tester $i &amp;amp;
done
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This loop takes the number of processes you would like to
spawn as an argument and runs &lt;code&gt;tester&lt;/code&gt; in that many processes. Go ahead and test it out!
You might notice, however, that the processes are run in batches, and the size of
each batch is &lt;code&gt;num_processes&lt;/code&gt;. This happens because
every time we spawn &lt;code&gt;num_processes&lt;/code&gt; processes, we &lt;code&gt;wait&lt;/code&gt; for all of them
to end. This is not a problem in itself; there are many cases
where this implementation works perfectly fine. However, if
you don&amp;rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.&lt;/p&gt;
&lt;h2 id=&#34;real-chads-use-job-pools&#34;&gt;Real Chads use Job Pools&lt;/h2&gt;
&lt;p&gt;The solution to the bottleneck introduced by our previous approach lies
in using job pools. A job pool is where jobs created by a main process get sent
and wait to get executed. This approach solves our problem because, instead of
spawning a new process for every job and waiting for all the processes to
finish, we only create a set number of processes (workers), which
continuously pick up jobs from the job pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
job_pool_end_of_jobs=&amp;#34;NO_JOB_LEFT&amp;#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&amp;lt;&amp;gt; ${job_queue}
while [[ &amp;#34;${cmd}&amp;#34; != &amp;#34;${job_pool_end_of_jobs}&amp;#34; &amp;amp;&amp;amp; -e &amp;#34;${job_queue}&amp;#34; ]]; do
flock --exclusive 7
IFS=$&amp;#39;\v&amp;#39;
read cmd args &amp;lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &amp;#34;${cmd}&amp;#34; == &amp;#34;${job_pool_end_of_jobs}&amp;#34; ]]; then
echo &amp;#34;${cmd}&amp;#34; &amp;gt;&amp;amp;7
else
{ ${cmd} &amp;#34;$@&amp;#34; ; }
fi
done
exec 7&amp;gt;&amp;amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &amp;gt;&amp;gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&amp;lt;${job_pool_pool_size}; i&amp;#43;&amp;#43;)); do
job_pool_worker ${i} ${job_queue} &amp;amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &amp;gt;${job_pool_progress} &amp;amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &amp;#34;${job_pool_pool_size}&amp;#34; == &amp;#34;-1&amp;#34; ]]; then
job_pool_init
fi
printf &amp;#34;%s\v&amp;#34; &amp;#34;$@&amp;#34; &amp;gt;&amp;gt; ${job_pool_job_queue}
echo &amp;gt;&amp;gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Ok&amp;hellip; But what the actual fuck is going on in here???&lt;/p&gt;
&lt;h3 id=&#34;fifo-and-flock&#34;&gt;fifo and flock&lt;/h3&gt;
&lt;p&gt;In order to understand what this code is doing, you first need to understand two
key building blocks that we are using, &lt;code&gt;fifo&lt;/code&gt; and &lt;code&gt;flock&lt;/code&gt;. Despite their complicated
names, they are actually quite simple. Let&amp;rsquo;s check their man pages to figure out
their purposes, shall we?&lt;/p&gt;
&lt;h4 id=&#34;man-fifo&#34;&gt;man fifo&lt;/h4&gt;
&lt;p&gt;fifo&amp;rsquo;s man page tells us that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So put in &lt;strong&gt;very&lt;/strong&gt; simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read them
with &lt;code&gt;read cmd args &amp;lt; ${job_queue}&lt;/code&gt;, the job is out of the pipe and the next
read outputs the next job in the pool. However, the fact that we have multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Their reads could interleave, leaving each worker with a garbled piece of a job, and we don&amp;rsquo;t want that. So we resort
to using &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id=&#34;man-flock&#34;&gt;man flock&lt;/h4&gt;
&lt;p&gt;flock&amp;rsquo;s man page defines it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Cool. Translated to the modern English that us regular folks use, &lt;code&gt;flock&lt;/code&gt; is a thin
wrapper around the &lt;code&gt;flock&lt;/code&gt; system call (see &lt;code&gt;man 2 flock&lt;/code&gt; if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its &lt;strong&gt;file descriptor number&lt;/strong&gt;. Aha! So that was the purpose of the &lt;code&gt;exec 7&amp;lt;&amp;gt; ${job_queue}&lt;/code&gt; call in the &lt;code&gt;job_pool_worker&lt;/code&gt; function. It
assigns the file descriptor 7 to the fifo &lt;code&gt;job_queue&lt;/code&gt;, which we afterwards lock with
&lt;code&gt;flock --exclusive 7&lt;/code&gt;. Cool. This way only one process at a time can read from
the fifo &lt;code&gt;job_queue&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;great-but-how-do-i-use-this&#34;&gt;Great! But how do I use this?&lt;/h2&gt;
&lt;p&gt;It depends on your preference: you can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
code below under our &amp;ldquo;chad&amp;rdquo; job pool script.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
num_workers=$1
job_pool_init $num_workers
pcount=0
for i in {1..10}; do
job_pool_run tester &amp;#34;$i&amp;#34;
done
job_pool_wait
job_pool_shutdown
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hopefully this article was (or will be) helpful to you. From now on, you don&amp;rsquo;t
ever have to write single-threaded bash scripts like normies :)&lt;/p&gt;
</content>
</item>
<item>
<title>$ ls awards/ certificates/</title>
<link>http://fr1nge.xyz/awards/</link>


+ 10
- 6
public/posts/index.html

@ -138,21 +138,25 @@
<div class="post on-list">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/test/">Test</a>
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a>
</h1>
<div class="post-meta">
<span class="post-date">
2021-04-13
2021-05-05
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/"></a>&nbsp;
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/"></a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
@ -161,14 +165,14 @@
<div class="post-content">
Bash is a great tool for automating tasks and improving your workflow. However, it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve the performance greatly.
</div>
<div>
<a class="read-more button"
href="/posts/test/">Read more →</a>
href="/posts/supercharge-your-bash-scripts-with-multiprocessing/">Read more →</a>
</div>
</div>


+ 328
- 7
public/posts/index.xml

@ -7,15 +7,336 @@
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Tue, 13 Apr 2021 23:26:07 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/posts/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/posts/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Test</title>
<link>http://fr1nge.xyz/posts/test/</link>
<pubDate>Tue, 13 Apr 2021 23:26:07 +0300</pubDate>
<title>Supercharge Your Bash Scripts with Multiprocessing</title>
<link>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/posts/test/</guid>
<description></description>
<content></content>
<guid>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</guid>
<description>Bash is a great tool for automating tasks and improving your workflow. However, it is SLOW. Adding multiprocessing to the scripts you write can improve the performance greatly.
What is multiprocessing? In the simplest terms, multiprocessing is the principle of splitting the computations or jobs that a script has to do and running them on different processes. In even simpler terms however, multiprocessing is the computer science equivalent of hiring more than one worker when you are constructing a building.</description>
<content>&lt;p&gt;Bash is a great tool for automating tasks and improving your workflow. However,
it is &lt;em&gt;&lt;strong&gt;SLOW&lt;/strong&gt;&lt;/em&gt;. Adding multiprocessing to the scripts you write can improve
the performance greatly.&lt;/p&gt;
&lt;h2 id=&#34;what-is-multiprocessing&#34;&gt;What is multiprocessing?&lt;/h2&gt;
&lt;p&gt;In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.&lt;/p&gt;
&lt;h3 id=&#34;introducing-&#34;&gt;Introducing &amp;ldquo;&amp;amp;&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;While implementing multiprocessing, the &lt;code&gt;&amp;amp;&lt;/code&gt; operator is going to be our greatest
friend. It is essential if you are writing bash scripts and a very
useful tool in general when you are in the terminal. Appending &lt;code&gt;&amp;amp;&lt;/code&gt; to a
command runs that command in the background and lets the rest of the script
continue while it runs. One thing to keep in mind is that the background
command runs in a fork of the current shell, so it gets its own copy of every
variable: if you change a variable in the main script after starting the
command, the background command still sees the old value. Here is a simple
example:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
foo=&amp;#34;yeet&amp;#34;
function run_in_background(){
sleep 0.5
echo &amp;#34;The value of foo in the function run_in_background is $foo&amp;#34;
}
run_in_background &amp;amp; # Spawn the function run_in_background in the background
foo=&amp;#34;YEET&amp;#34;
echo &amp;#34;The value of foo changed to $foo.&amp;#34;
wait # wait for the background process to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This should output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, the value of &lt;code&gt;foo&lt;/code&gt; did not change in the background process even though
we changed it in the main script.&lt;/p&gt;
&lt;h2 id=&#34;baby-steps&#34;&gt;Baby steps&amp;hellip;&lt;/h2&gt;
&lt;p&gt;Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&amp;rsquo;s open up vim and get to scripting!
First of all, let&amp;rsquo;s write a very simple function that allows us to easily test
our implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled being the keyword here. That&amp;rsquo;s because
each system has a maximum number of processes that can be spawned (you can find
that out with the command &lt;code&gt;ulimit -u&lt;/code&gt;). In our case, we want to limit the
number of running processes to the value of &lt;code&gt;num_processes&lt;/code&gt;. Here is the implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&amp;#43;&amp;#43;==0)) &amp;amp;&amp;amp; wait
tester $i &amp;amp;
done
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This loop takes the number of processes you would like to
spawn as an argument and runs &lt;code&gt;tester&lt;/code&gt; in that many processes. Go ahead and test it out!
You might notice, however, that the processes are run in batches, and the size of
each batch is &lt;code&gt;num_processes&lt;/code&gt;. This happens because
every time we spawn &lt;code&gt;num_processes&lt;/code&gt; processes, we &lt;code&gt;wait&lt;/code&gt; for all of them
to end. This is not a problem in itself; there are many cases
where this implementation works perfectly fine. However, if
you don&amp;rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.&lt;/p&gt;
&lt;h2 id=&#34;real-chads-use-job-pools&#34;&gt;Real Chads use Job Pools&lt;/h2&gt;
&lt;p&gt;The solution to the bottleneck introduced by our previous approach lies
in using job pools. A job pool is a queue where jobs created by a main process are sent
to wait for execution. This approach solves our problem because, instead of
spawning a new process for every job and waiting for the whole batch to
finish, we create a fixed number of processes (workers) which
continuously pick up jobs from the pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
job_pool_end_of_jobs=&amp;#34;NO_JOB_LEFT&amp;#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&amp;lt;&amp;gt; ${job_queue}
while [[ &amp;#34;${cmd}&amp;#34; != &amp;#34;${job_pool_end_of_jobs}&amp;#34; &amp;amp;&amp;amp; -e &amp;#34;${job_queue}&amp;#34; ]]; do
flock --exclusive 7
IFS=$&amp;#39;\v&amp;#39;
read cmd args &amp;lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &amp;#34;${cmd}&amp;#34; == &amp;#34;${job_pool_end_of_jobs}&amp;#34; ]]; then
echo &amp;#34;${cmd}&amp;#34; &amp;gt;&amp;amp;7
else
{ ${cmd} &amp;#34;$@&amp;#34; ; }
fi
done
exec 7&amp;gt;&amp;amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &amp;gt;&amp;gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&amp;lt;${job_pool_pool_size}; i&amp;#43;&amp;#43;)); do
job_pool_worker ${i} ${job_queue} &amp;amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &amp;gt;${job_pool_progress} &amp;amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &amp;#34;${job_pool_pool_size}&amp;#34; == &amp;#34;-1&amp;#34; ]]; then
job_pool_init
fi
printf &amp;#34;%s\v&amp;#34; &amp;#34;$@&amp;#34; &amp;gt;&amp;gt; ${job_pool_job_queue}
echo &amp;gt;&amp;gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Ok&amp;hellip; But what the actual fuck is going on in here???&lt;/p&gt;
&lt;h3 id=&#34;fifo-and-flock&#34;&gt;fifo and flock&lt;/h3&gt;
&lt;p&gt;In order to understand what this code is doing, you first need to understand the two
key tools that we are using: &lt;code&gt;fifo&lt;/code&gt; (a special file created with &lt;code&gt;mkfifo&lt;/code&gt;) and the &lt;code&gt;flock&lt;/code&gt; command. Despite their complicated
names, they are actually quite simple. Let&amp;rsquo;s check their man pages to figure out
their purposes, shall we?&lt;/p&gt;
&lt;h4 id=&#34;man-fifo&#34;&gt;man fifo&lt;/h4&gt;
&lt;p&gt;fifo&amp;rsquo;s man page tells us that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So put in &lt;strong&gt;very&lt;/strong&gt; simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read one
with &lt;code&gt;read cmd args &amp;lt; ${job_queue}&lt;/code&gt;, that job is out of the pipe and the next
read returns the next job in the pool. However, having multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Their reads could interleave, leaving each worker with a garbled fragment of a job, and we don&amp;rsquo;t want that. So we resort
to using &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id=&#34;man-flock&#34;&gt;man flock&lt;/h4&gt;
&lt;p&gt;flock&amp;rsquo;s man page defines it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Cool. Translated to the modern English that us regular folks use, &lt;code&gt;flock&lt;/code&gt; is a thin
wrapper around the &lt;code&gt;flock&lt;/code&gt; system call (see &lt;code&gt;man 2 flock&lt;/code&gt; if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its &lt;strong&gt;file descriptor number&lt;/strong&gt;. Aha! So that was the purpose of the &lt;code&gt;exec 7&amp;lt;&amp;gt; ${job_queue}&lt;/code&gt; call in the &lt;code&gt;job_pool_worker&lt;/code&gt; function. It
assigns file descriptor 7 to the fifo &lt;code&gt;job_queue&lt;/code&gt;, and the worker afterwards locks it with
&lt;code&gt;flock --exclusive 7&lt;/code&gt;. Cool. This way only one process at a time can read from
the fifo &lt;code&gt;job_queue&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;great-but-how-do-i-use-this&#34;&gt;Great! But how do I use this?&lt;/h2&gt;
&lt;p&gt;It depends on your preference: you can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
code below under our &amp;ldquo;chad&amp;rdquo; job pool script.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
num_workers=$1
job_pool_init &amp;#34;$num_workers&amp;#34;
for i in {1..10}; do
job_pool_run tester &amp;#34;$i&amp;#34;
done
job_pool_wait
job_pool_shutdown
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hopefully this article was (or will be) helpful to you. From now on, you don&amp;rsquo;t
ever have to write single-process bash scripts like normies :)&lt;/p&gt;
</content>
</item>
</channel>


+ 542
- 0
public/posts/supercharge-your-bash-scripts-with-multiprocessing/index.html View File

@ -0,0 +1,542 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>Supercharge Your Bash Scripts with Multiprocessing :: Fr1nge&#39;s Personal Blog</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Bash is a great tool for automating tasks and improving your workflow. However, it is ***SLOW***. Adding multiprocessing to the scripts you write can improve the performance greatly." />
<meta name="keywords" content="bash, scripting" />
<meta name="robots" content="noodp" />
<link rel="canonical" href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/" />
<link rel="stylesheet" href="http://fr1nge.xyz/assets/style.css">
<link rel="stylesheet" href="http://fr1nge.xyz/assets/blue.css">
<link rel="apple-touch-icon" href="http://fr1nge.xyz/img/apple-touch-icon-192x192.png">
<link rel="shortcut icon" href="http://fr1nge.xyz/img/favicon/blue.png">
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="yigitcolakoglu.com" />
<meta name="twitter:creator" content="theFr1nge" />
<meta property="og:locale" content="en" />
<meta property="og:type" content="article" />
<meta property="og:title" content="Supercharge Your Bash Scripts with Multiprocessing">
<meta property="og:description" content="Bash is a great tool for automating tasks and improving your workflow. However, it is ***SLOW***. Adding multiprocessing to the scripts you write can improve the performance greatly." />
<meta property="og:url" content="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/" />
<meta property="og:site_name" content="Fr1nge&#39;s Personal Blog" />
<meta property="og:image" content="http://fr1nge.xyz/">
<meta property="og:image:width" content="2048">
<meta property="og:image:height" content="1024">
<meta property="article:published_time" content="2021-05-05 17:08:12 &#43;0300 &#43;03" />
</head>
<body class="blue">
<div class="container center headings--one-size">
<header class="header">
<div class="header__inner">
<div class="header__logo">
<a href="/">
<div class="logo">
fr1nge.xyz
</div>
</a>
</div>
<div class="menu-trigger">menu</div>
</div>
<nav class="menu">
<ul class="menu__inner menu__inner--desktop">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
<ul class="menu__inner menu__inner--mobile">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
</nav>
</header>
<div class="content">
<div class="post">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a></h1>
<div class="post-meta">
<span class="post-date">
2021-05-05 [Updated: 2021-05-05]
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
<div class="post-content"><div>
<p>Bash is a great tool for automating tasks and improving your workflow. However,
it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve
the performance greatly.</p>
<h2 id="what-is-multiprocessing">What is multiprocessing?<a href="#what-is-multiprocessing" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h2>
<p>In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.</p>
<h3 id="introducing-">Introducing &ldquo;&amp;&rdquo;<a href="#introducing-" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h3>
<p>While implementing multiprocessing, the <code>&amp;</code> operator is going to be our greatest
friend. It is essential if you are writing bash scripts and a very
useful tool in general when you are in the terminal. Appending <code>&amp;</code> to a command
runs that command in the background and allows
the rest of the script to continue while it runs.
One thing to keep in mind is that the background command runs in a fork of the
process you launched it from, so if you change a variable in the main script
after launching it, the background command will not see the change. Here is a simple
example:</p>
<div class="collapsable-code">
<input id="1" type="checkbox" />
<label for="1">
<span class="collapsable-code__language">bash</span>
<span class="collapsable-code__toggle" data-label-expand="Show" data-label-collapse="Hide"></span>
</label>
<pre class="language-bash" ><code>
foo=&#34;yeet&#34;
function run_in_background(){
sleep 0.5
echo &#34;The value of foo in the function run_in_background is $foo&#34;
}
run_in_background &amp; # Spawn the function run_in_background in the background
foo=&#34;YEET&#34;
echo &#34;The value of foo changed to $foo.&#34;
wait # wait for the background process to finish
</code></pre>
</div>
<p>This should output:</p>
<pre><code>The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
</code></pre><p>As you can see, the value of <code>foo</code> did not change in the background process even though
we changed it in the main script.</p>
<h2 id="baby-steps">Baby steps&hellip;<a href="#baby-steps" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h2>
<p>Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&rsquo;s open up vim and get to scripting!
First of all, let&rsquo;s write a very simple function that allows us to easily test
our implementation:</p>
<div class="collapsable-code">
<input id="1" type="checkbox" />
<label for="1">
<span class="collapsable-code__language">bash</span>
<span class="collapsable-code__toggle" data-label-expand="Show" data-label-collapse="Hide"></span>
</label>
<pre class="language-bash" ><code>
function tester(){
# A function that takes an int as a parameter and sleeps
echo &#34;$1&#34;
sleep &#34;$1&#34;
echo &#34;ENDED $1&#34;
}
</code></pre>
</div>
<p>Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled is the keyword here, because
each system has a maximum number of processes that a user can spawn (you can find
that limit with the command <code>ulimit -u</code>). In our case, we want to limit the number of
processes running at once to the variable <code>num_processes</code>. Here is the implementation:</p>
<div class="collapsable-code">
<input id="1" type="checkbox" />
<label for="1">
<span class="collapsable-code__language">bash</span>
<span class="collapsable-code__toggle" data-label-expand="Show" data-label-collapse="Hide"></span>
</label>
<pre class="language-bash" ><code>
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&#43;&#43;==0)) &amp;&amp; wait
tester &#34;$i&#34; &amp;
done
wait # wait for the last batch to finish
</code></pre>
</div>
<p>This loop takes the number of processes you would like to
spawn as an argument and runs <code>tester</code> in that many processes at a time. Go ahead and test it out!
You might notice, however, that the processes run in batches, and the size of each
batch is the <code>num_processes</code> variable. This happens because
every time we have spawned <code>num_processes</code> processes, we <code>wait</code> for all of them
to end, so one slow job holds up the whole next batch. This implementation is not a problem in itself; there are many cases
where it works perfectly fine. However, if
you don&rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.</p>
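<p>If the batching bothers you but you are not ready for a full job pool yet, there is a middle ground. The sketch below is not part of the original script: it assumes bash 4.3 or newer, where <code>wait -n</code> returns as soon as any one background job exits, and the helper name <code>run_limited</code> is made up for illustration.</p>

```bash
#!/usr/bin/env bash
# Sketch only: keep at most max_jobs background jobs alive at any time.
# Assumes bash >= 4.3 for `wait -n`; run_limited is a hypothetical helper name.

tester() {
    # Same idea as the tester function from the article: print, sleep, print.
    echo "$1"
    sleep "$1"
    echo "ENDED $1"
}

run_limited() {
    local max_jobs=$1
    shift
    local running=0
    local arg
    for arg in "$@"; do
        if (( running >= max_jobs )); then
            # All slots are taken: block until any single job exits,
            # then hand its slot to the next job.
            wait -n
            running=$((running - 1))
        fi
        tester "$arg" &
        running=$((running + 1))
    done
    wait  # wait for whatever is still running
}
```

<p>Calling <code>run_limited 3 {1..10}</code> should start a new <code>tester</code> whenever one of the running ones ends, instead of waiting for a whole batch. The counter can only overestimate how many jobs are alive, so the limit is never exceeded.</p>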
<h2 id="real-chads-use-job-pools">Real Chads use Job Pools<a href="#real-chads-use-job-pools" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h2>
<p>The solution to the bottleneck introduced by our previous approach lies
in using job pools. A job pool is a queue where jobs created by a main process are sent
to wait for execution. This approach solves our problem because, instead of
spawning a new process for every job and waiting for the whole batch to
finish, we create a fixed number of processes (workers) which
continuously pick up jobs from the pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.</p>
<div class="collapsable-code">
<input id="1" type="checkbox" />
<label for="1">
<span class="collapsable-code__language">bash</span>
<span class="collapsable-code__toggle" data-label-expand="Show" data-label-collapse="Hide"></span>
</label>
<pre class="language-bash" ><code>
job_pool_end_of_jobs=&#34;NO_JOB_LEFT&#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&lt;&gt; ${job_queue}
while [[ &#34;${cmd}&#34; != &#34;${job_pool_end_of_jobs}&#34; &amp;&amp; -e &#34;${job_queue}&#34; ]]; do
flock --exclusive 7
IFS=$&#39;\v&#39;
read cmd args &lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &#34;${cmd}&#34; == &#34;${job_pool_end_of_jobs}&#34; ]]; then
echo &#34;${cmd}&#34; &gt;&amp;7
else
{ ${cmd} &#34;$@&#34; ; }
fi
done
exec 7&gt;&amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &gt;&gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&lt;${job_pool_pool_size}; i&#43;&#43;)); do
job_pool_worker ${i} ${job_queue} &amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &gt;${job_pool_progress} &amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &#34;${job_pool_pool_size}&#34; == &#34;-1&#34; ]]; then
job_pool_init
fi
printf &#34;%s\v&#34; &#34;$@&#34; &gt;&gt; ${job_pool_job_queue}
echo &gt;&gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
</code></pre>
</div>
<p>Ok&hellip; But what the actual fuck is going on in here???</p>
<h3 id="fifo-and-flock">fifo and flock<a href="#fifo-and-flock" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h3>
<p>In order to understand what this code is doing, you first need to understand the two
key tools that we are using: <code>fifo</code> (a special file created with <code>mkfifo</code>) and the <code>flock</code> command. Despite their complicated
names, they are actually quite simple. Let&rsquo;s check their man pages to figure out
their purposes, shall we?</p>
<h4 id="man-fifo">man fifo<a href="#man-fifo" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h4>
<p>fifo&rsquo;s man page tells us that:</p>
<pre><code>NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
</code></pre><p>So put in <strong>very</strong> simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read one
with <code>read cmd args &lt; ${job_queue}</code>, that job is out of the pipe and the next
read returns the next job in the pool. However, having multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Their reads could interleave, leaving each worker with a garbled fragment of a job, and we don&rsquo;t want that. So we resort
to using <code>flock</code>.</p>
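<p>To see the &ldquo;read it and it is gone&rdquo; behaviour in isolation, here is a minimal sketch with a single writer and a single reader. The function name <code>fifo_demo</code>, the temporary path and the three job strings are all made up for illustration; none of them come from the job pool script.</p>

```bash
#!/usr/bin/env bash
# Sketch only: push three made-up job lines through a named pipe.

fifo_demo() {
    local queue line
    # mktemp -u only generates an unused name; mkfifo creates the pipe itself.
    queue=$(mktemp -u "${TMPDIR:-/tmp}/fifo_demo.XXXXXX")
    mkfifo "$queue"

    # Writer: runs in the background, because opening a fifo for writing
    # blocks until some process opens it for reading.
    printf '%s\n' "job one" "job two" "job three" > "$queue" &

    # Reader: every read consumes one line, so nothing has to be
    # deleted by hand. The loop ends when the writer closes the pipe.
    while read -r line; do
        echo "got: $line"
    done < "$queue"

    wait
    rm -f "$queue"
}
```

<p>Running <code>fifo_demo</code> prints the three jobs in the order they were written, each prefixed with <code>got:</code>.</p>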
<h4 id="man-flock">man flock<a href="#man-flock" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h4>
<p>flock&rsquo;s man page defines it as:</p>
<pre><code> SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
</code></pre><p>Cool. Translated to the modern English that us regular folks use, <code>flock</code> is a thin
wrapper around the <code>flock</code> system call (see <code>man 2 flock</code> if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its <strong>file descriptor number</strong>. Aha! So that was the purpose of the <code>exec 7&lt;&gt; ${job_queue}</code> call in the <code>job_pool_worker</code> function. It
assigns file descriptor 7 to the fifo <code>job_queue</code>, and the worker afterwards locks it with
<code>flock --exclusive 7</code>. Cool. This way only one process at a time can read from
the fifo <code>job_queue</code>.</p>
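<p>The same lock-by-file-descriptor pattern can be tried on something simpler than a job queue. The sketch below assumes the <code>flock</code> utility from util-linux is installed; the function names, the descriptor number 8 and the temporary files are hypothetical and only there for the demonstration.</p>

```bash
#!/usr/bin/env bash
# Sketch only: four background workers each bump a shared counter 25 times.
# Assumes the util-linux `flock` command is available.

bump_counter() {
    local counter=$1 lockfile=$2 n i
    # Each worker opens the lock file itself, so each one gets its own
    # open file description and the locks really exclude one another.
    exec 8> "$lockfile"
    for i in {1..25}; do
        flock --exclusive 8      # block until this worker owns the lock
        read -r n < "$counter"
        echo $((n + 1)) > "$counter"
        flock --unlock 8         # let the next worker in
    done
    exec 8>&-
}

flock_demo() {
    local counter lockfile w
    counter=$(mktemp)
    lockfile=$(mktemp)
    echo 0 > "$counter"
    for w in 1 2 3 4; do
        bump_counter "$counter" "$lockfile" &
    done
    wait
    cat "$counter"
    rm -f "$counter" "$lockfile"
}
```

<p>With the lock in place, <code>flock_demo</code> prints 100. If you comment out the two <code>flock</code> lines, the workers can overwrite each other&rsquo;s updates and the total is no longer guaranteed.</p>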
<h2 id="great-but-how-do-i-use-this">Great! But how do I use this?<a href="#great-but-how-do-i-use-this" class="hanchor" ariaLabel="Anchor">&#8983;</a> </h2>
<p>It depends on your preference: you can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
code below under our &ldquo;chad&rdquo; job pool script.</p>
<div class="collapsable-code">
<input id="1" type="checkbox" />
<label for="1">
<span class="collapsable-code__language">bash</span>
<span class="collapsable-code__toggle" data-label-expand="Show" data-label-collapse="Hide"></span>
</label>
<pre class="language-bash" ><code>
function tester(){
# A function that takes an int as a parameter and sleeps
echo &#34;$1&#34;
sleep &#34;$1&#34;
echo &#34;ENDED $1&#34;
}
num_workers=$1
job_pool_init &#34;$num_workers&#34;
for i in {1..10}; do
job_pool_run tester &#34;$i&#34;
done
job_pool_wait
job_pool_shutdown
</code></pre>
</div>
<p>Hopefully this article was (or will be) helpful to you. From now on, you don&rsquo;t
ever have to write single-process bash scripts like normies :)</p>
</div></div>
<div id="disqus_thread"></div>
<script type="application/javascript">
var disqus_config = function () {
};
(function() {
if (["localhost", "127.0.0.1"].indexOf(window.location.hostname) != -1) {
document.getElementById('disqus_thread').innerHTML = 'Disqus comments not available by default when the website is previewed locally.';
return;
}
var d = document, s = d.createElement('script'); s.async = true;
s.src = '//' + "fr1nge-xyz" + '.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</div>
</div>
<footer class="footer">
<div class="footer__inner">
<div class="copyright copyright--user">
<span>Yigit Colakoglu</span>
<span>:: Theme made by <a href="https://twitter.com/panr">panr</a></span>
</div>
</div>
</footer>
<script src="http://fr1nge.xyz/assets/main.js"></script>
<script src="http://fr1nge.xyz/assets/prism.js"></script>
</div>
</body>
</html>

+ 20
- 1
public/projects/index.html View File

@ -160,7 +160,26 @@
<div id="disqus_thread"></div>
<script type="application/javascript">
var disqus_config = function () {
};
(function() {
if (["localhost", "127.0.0.1"].indexOf(window.location.hostname) != -1) {
document.getElementById('disqus_thread').innerHTML = 'Disqus comments not available by default when the website is previewed locally.';
return;
}
var d = document, s = d.createElement('script'); s.async = true;
s.src = '//' + "fr1nge-xyz" + '.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</div>


+ 21
- 4
public/sitemap.xml View File

@ -2,6 +2,27 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>http://fr1nge.xyz/tags/bash/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/posts/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/tags/programming/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/tags/scripting/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/tags/</loc>
<lastmod>2021-05-05T17:08:12+03:00</lastmod>
</url><url>
<loc>http://fr1nge.xyz/awards/</loc>
</url><url>
<loc>http://fr1nge.xyz/projects/</loc>
@ -9,9 +30,5 @@
<loc>http://fr1nge.xyz/about/</loc>
</url><url>
<loc>http://fr1nge.xyz/categories/</loc>
</url><url>
<loc>http://fr1nge.xyz/</loc>
</url><url>
<loc>http://fr1nge.xyz/tags/</loc>
</url>
</urlset>

+ 216
- 0
public/tags/bash/index.html View File

@ -0,0 +1,216 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>bash :: Fr1nge&#39;s Personal Blog</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="" />
<meta name="keywords" content="Tech, Linux, Programming, Security" />
<meta name="robots" content="noodp" />
<link rel="canonical" href="http://fr1nge.xyz/tags/bash/" />
<link rel="stylesheet" href="http://fr1nge.xyz/assets/style.css">
<link rel="stylesheet" href="http://fr1nge.xyz/assets/blue.css">
<link rel="apple-touch-icon" href="http://fr1nge.xyz/img/apple-touch-icon-192x192.png">
<link rel="shortcut icon" href="http://fr1nge.xyz/img/favicon/blue.png">
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="yigitcolakoglu.com" />
<meta name="twitter:creator" content="" />
<meta property="og:locale" content="en" />
<meta property="og:type" content="website" />
<meta property="og:title" content="bash">
<meta property="og:description" content="" />
<meta property="og:url" content="http://fr1nge.xyz/tags/bash/" />
<meta property="og:site_name" content="Fr1nge&#39;s Personal Blog" />
<meta property="og:image" content="http://fr1nge.xyz/img/favicon/blue.png">
<meta property="og:image:width" content="2048">
<meta property="og:image:height" content="1024">
<link href="/tags/bash/index.xml" rel="alternate" type="application/rss+xml" title="Fr1nge&#39;s Personal Blog" />
</head>
<body class="blue">
<div class="container center headings--one-size">
<header class="header">
<div class="header__inner">
<div class="header__logo">
<a href="/">
<div class="logo">
fr1nge.xyz
</div>
</a>
</div>
<div class="menu-trigger">menu</div>
</div>
<nav class="menu">
<ul class="menu__inner menu__inner--desktop">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
<ul class="menu__inner menu__inner--mobile">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
</nav>
</header>
<div class="content">
<div class="posts">
<div class="post on-list">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a>
</h1>
<div class="post-meta">
<span class="post-date">
2021-05-05
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
<div class="post-content">
Bash is a great tool for automating tasks and improving your workflow. However, it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve the performance greatly.
</div>
<div>
<a class="read-more button"
href="/posts/supercharge-your-bash-scripts-with-multiprocessing/">Read more →</a>
</div>
</div>
<div class="pagination">
<div class="pagination__buttons">
</div>
</div>
</div>
</div>
<footer class="footer">
<div class="footer__inner">
<div class="copyright copyright--user">
<span>Yigit Colakoglu</span>
<span>:: Theme made by <a href="https://twitter.com/panr">panr</a></span>
</div>
</div>
</footer>
<script src="http://fr1nge.xyz/assets/main.js"></script>
<script src="http://fr1nge.xyz/assets/prism.js"></script>
</div>
</body>
</html>

+ 343
- 0
public/tags/bash/index.xml View File

@ -0,0 +1,343 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>bash on Fr1nge&#39;s Personal Blog</title>
<link>http://fr1nge.xyz/tags/bash/</link>
<description>Recent content in bash on Fr1nge&#39;s Personal Blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/tags/bash/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Supercharge Your Bash Scripts with Multiprocessing</title>
<link>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</guid>
<description>Bash is a great tool for automating tasks and improving your workflow. However, it is SLOW. Adding multiprocessing to the scripts you write can improve the performance greatly.
What is multiprocessing? In the simplest terms, multiprocessing is the principle of splitting the computations or jobs that a script has to do and running them on different processes. In even simpler terms however, multiprocessing is the computer science equivalent of hiring more than one worker when you are constructing a building.</description>
<content>&lt;p&gt;Bash is a great tool for automating tasks and improving your workflow. However,
it is &lt;em&gt;&lt;strong&gt;SLOW&lt;/strong&gt;&lt;/em&gt;. Adding multiprocessing to the scripts you write can improve
the performance greatly.&lt;/p&gt;
&lt;h2 id=&#34;what-is-multiprocessing&#34;&gt;What is multiprocessing?&lt;/h2&gt;
&lt;p&gt;In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.&lt;/p&gt;
&lt;h3 id=&#34;introducing-&#34;&gt;Introducing &amp;ldquo;&amp;amp;&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;While implementing multiprocessing, the &lt;code&gt;&amp;amp;&lt;/code&gt; sign is going to be our greatest
friend. It is essential if you are writing bash scripts and a very
useful tool in general when you are in the terminal. Appending &lt;code&gt;&amp;amp;&lt;/code&gt; to a
command runs that command in the background and lets the rest of the script
continue while it runs. One thing to keep in mind is that the background
command runs in a fork of the current shell, so it gets its own copy of every
variable. If the script changes a variable after spawning the command, the
background command still sees the old value. Here is a simple
example:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
foo=&amp;#34;yeet&amp;#34;
function run_in_background(){
sleep 0.5
echo &amp;#34;The value of foo in the function run_in_background is $foo&amp;#34;
}
run_in_background &amp;amp; # Spawn the function run_in_background in the background
foo=&amp;#34;YEET&amp;#34;
echo &amp;#34;The value of foo changed to $foo.&amp;#34;
wait # wait for the background process to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This should output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, the value of &lt;code&gt;foo&lt;/code&gt; did not change in the background process even though
we changed it in the main script.&lt;/p&gt;
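&lt;p&gt;The isolation works in the other direction too. As a minimal sketch (the function name &lt;code&gt;show_isolation&lt;/code&gt; is made up for illustration), a change made inside a child process never reaches the parent. Parentheses are used here instead of the background operator only to keep the output order predictable; both run the commands in a forked subshell.&lt;/p&gt;

```shell
# Sketch: a child process works on a copy of the parent's variables.
show_isolation() {
    foo="yeet"
    # The parentheses run these commands in a subshell (a forked child).
    ( foo="YEET"; echo "child sees $foo" )
    # Back in the parent, the child's assignment is gone.
    echo "parent still sees $foo"
}

show_isolation
```

&lt;p&gt;The second line printed should still show the lowercase value, which is why a background job cannot hand results back through plain variables.&lt;/p&gt;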
&lt;h2 id=&#34;baby-steps&#34;&gt;Baby steps&amp;hellip;&lt;/h2&gt;
&lt;p&gt;Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&amp;rsquo;s open up vim and get to scripting!
First of all, let&amp;rsquo;s write a very simple function that allows us to easily test
our implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled being the keyword here. That&amp;rsquo;s because
each user can only spawn a limited number of processes (you can find
your limit with the command &lt;code&gt;ulimit -u&lt;/code&gt;). In our case, we want to limit the
number of processes running at once to the variable &lt;code&gt;num_processes&lt;/code&gt;. Here is the implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&amp;#43;&amp;#43;==0)) &amp;amp;&amp;amp; wait # a batch is full: wait for it to finish
tester &amp;#34;$i&amp;#34; &amp;amp;
done
wait # wait for the last batch
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This loop takes the number of processes you would like to
spawn as an argument and runs &lt;code&gt;tester&lt;/code&gt; in that many processes at a time. Go ahead and test it out!
You might notice, however, that the processes run in batches, and the size of each
batch is &lt;code&gt;num_processes&lt;/code&gt;. This happens because
every time we have spawned &lt;code&gt;num_processes&lt;/code&gt; processes, we &lt;code&gt;wait&lt;/code&gt; for all of them
to end, so one slow job holds up the whole next batch. This is not a problem in itself; there are many cases
where this implementation works perfectly fine. However, if
you don&amp;rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.&lt;/p&gt;
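&lt;p&gt;Whichever approach you use, you still have to pick a value for &lt;code&gt;num_processes&lt;/code&gt;. A rough sketch of one way to choose a default follows. It assumes that one worker per CPU core is a sensible starting point (usually true for CPU-bound jobs, less so for jobs that mostly sleep or wait on the network) and that &lt;code&gt;getconf _NPROCESSORS_ONLN&lt;/code&gt; is available, as it is on Linux and macOS. The helper name &lt;code&gt;default_workers&lt;/code&gt; is made up for this example.&lt;/p&gt;

```shell
# Sketch: pick a default worker count instead of requiring an argument.
default_workers() {
    # getconf reports the number of online CPU cores on Linux and macOS;
    # fall back to 2 if the query is not supported.
    getconf _NPROCESSORS_ONLN || echo 2
}

# Use the first script argument when given, otherwise the default.
num_processes=${1:-$(default_workers)}
echo "running with $num_processes workers"
```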
&lt;h2 id=&#34;real-chads-use-job-pools&#34;&gt;Real Chads use Job Pools&lt;/h2&gt;
&lt;p&gt;The solution to the bottleneck introduced by our previous approach lies
in using job pools. A job pool is a queue where jobs created by a main process are sent
and wait to be executed. This approach solves our problem because, instead of
spawning a new process for every job and waiting for a whole batch to
finish, we create a fixed number of processes (workers) which
continuously pick up jobs from the pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
job_pool_end_of_jobs=&amp;#34;NO_JOB_LEFT&amp;#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&amp;lt;&amp;gt; ${job_queue}
while [[ &amp;#34;${cmd}&amp;#34; != &amp;#34;${job_pool_end_of_jobs}&amp;#34; &amp;amp;&amp;amp; -e &amp;#34;${job_queue}&amp;#34; ]]; do
flock --exclusive 7
IFS=$&amp;#39;\v&amp;#39;
read cmd args &amp;lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &amp;#34;${cmd}&amp;#34; == &amp;#34;${job_pool_end_of_jobs}&amp;#34; ]]; then
echo &amp;#34;${cmd}&amp;#34; &amp;gt;&amp;amp;7
else
{ ${cmd} &amp;#34;$@&amp;#34; ; }
fi
done
exec 7&amp;gt;&amp;amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &amp;gt;&amp;gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&amp;lt;${job_pool_pool_size}; i&amp;#43;&amp;#43;)); do
job_pool_worker ${i} ${job_queue} &amp;amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &amp;gt;${job_pool_progress} &amp;amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &amp;#34;${job_pool_pool_size}&amp;#34; == &amp;#34;-1&amp;#34; ]]; then
job_pool_init
fi
printf &amp;#34;%s\v&amp;#34; &amp;#34;$@&amp;#34; &amp;gt;&amp;gt; ${job_pool_job_queue}
echo &amp;gt;&amp;gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Ok&amp;hellip; But what the actual fuck is going on in here???&lt;/p&gt;
&lt;h3 id=&#34;fifo-and-flock&#34;&gt;fifo and flock&lt;/h3&gt;
&lt;p&gt;In order to understand what this code is doing, you first need to understand two
key tools that we are using: the &lt;code&gt;fifo&lt;/code&gt; (a special file, created here with &lt;code&gt;mkfifo&lt;/code&gt;) and the &lt;code&gt;flock&lt;/code&gt; command. Despite their complicated
names, they are actually quite simple. Let&amp;rsquo;s check their man pages to figure out
their purposes, shall we?&lt;/p&gt;
&lt;h4 id=&#34;man-fifo&#34;&gt;man fifo&lt;/h4&gt;
&lt;p&gt;fifo&amp;rsquo;s man page tells us that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So put in &lt;strong&gt;very&lt;/strong&gt; simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read a job
with &lt;code&gt;read cmd args &amp;lt; ${job_queue}&lt;/code&gt;, it is out of the pipe and the next
read returns the next job in the pool. However, the fact that we have multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Their reads could interleave, leaving each worker with a fragment of a job, and we don&amp;rsquo;t want that. So we resort
to using &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;
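&lt;p&gt;It may help to see what a single job looks like on its way through the queue. The sketch below replays the line format with an ordinary pipe instead of the fifo: each word of the job is followed by a vertical tab, and the reader splits the line on that character. The names &lt;code&gt;encode_job&lt;/code&gt; and &lt;code&gt;decode_job&lt;/code&gt; are made up for illustration; they mirror the &lt;code&gt;printf&lt;/code&gt; in &lt;code&gt;job_pool_run&lt;/code&gt; and the &lt;code&gt;read&lt;/code&gt; and &lt;code&gt;set&lt;/code&gt; steps in &lt;code&gt;job_pool_worker&lt;/code&gt;.&lt;/p&gt;

```shell
# Sketch of the queue's line format, using a plain pipe instead of the fifo.
vt=$(printf '\v')   # vertical tab, the field separator used by job_pool_run

encode_job() {
    # Same format as job_pool_run: every word followed by \v, then a newline.
    printf '%s\v' "$@"
    echo
}

decode_job() {
    # Same steps as job_pool_worker: split off the command, then the arguments.
    IFS=$vt read -r cmd args
    IFS=$vt
    set -- $args
    unset IFS
    echo "command=$cmd argc=$# first=$1"
}

encode_job tester 3 | decode_job
```

&lt;p&gt;Because the separator is a vertical tab rather than a space, an argument that contains spaces should survive the trip as a single argument.&lt;/p&gt;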
&lt;h4 id=&#34;man-flock&#34;&gt;man flock&lt;/h4&gt;
&lt;p&gt;flock&amp;rsquo;s man page defines it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Cool. Translated to the modern English that us regular folks use, &lt;code&gt;flock&lt;/code&gt; is a thin
wrapper around the &lt;code&gt;flock&lt;/code&gt; system call (see &lt;code&gt;man 2 flock&lt;/code&gt; if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its &lt;strong&gt;file descriptor number&lt;/strong&gt;. Aha! So that was the purpose of the &lt;code&gt;exec 7&amp;lt;&amp;gt; ${job_queue}&lt;/code&gt; call in the &lt;code&gt;job_pool_worker&lt;/code&gt; function. It
assigns the file descriptor 7 to the fifo &lt;code&gt;job_queue&lt;/code&gt;, which each worker then locks with
&lt;code&gt;flock --exclusive 7&lt;/code&gt; before reading. Cool. This way only one process at a time can read from
the fifo &lt;code&gt;job_queue&lt;/code&gt;.&lt;/p&gt;
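&lt;p&gt;To watch a lock refuse a second taker, here is a small sketch that uses the first form from the man page (lock a file, then run a command) rather than the file descriptor form the job pool uses. It assumes the util-linux &lt;code&gt;flock&lt;/code&gt; is installed; the function name &lt;code&gt;lock_demo&lt;/code&gt; and the temporary lock file are made up for the example.&lt;/p&gt;

```shell
# Sketch: a second attempt to lock the same file fails while the first is held.
lock_demo() {
    lockfile=$(mktemp)
    # The outer flock holds the lock while it runs the inner command.
    # The inner flock uses -n (non-blocking), so it gives up instead of waiting.
    flock "$lockfile" flock -n "$lockfile" echo "inner got the lock" \
        || echo "inner lock attempt refused"
    rm -f "$lockfile"
}

lock_demo
```

&lt;p&gt;Without &lt;code&gt;-n&lt;/code&gt; the inner call would block forever, which is exactly the behaviour the workers rely on: each one waits its turn until the lock on descriptor 7 is released.&lt;/p&gt;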
&lt;h2 id=&#34;great-but-how-do-i-use-this&#34;&gt;Great! But how do I use this?&lt;/h2&gt;
&lt;p&gt;It depends on your preference. You can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that runs the same jobs as our first implementation. Just paste the
code below under our &amp;ldquo;chad&amp;rdquo; job pool script.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
num_workers=$1
job_pool_init &amp;#34;$num_workers&amp;#34;
for i in {1..10}; do
job_pool_run tester &amp;#34;$i&amp;#34;
done
job_pool_wait
job_pool_shutdown
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hopefully this article was (or will be) helpful to you. From now on, you don&amp;rsquo;t
ever have to write single-process bash scripts like normies :)&lt;/p&gt;
</content>
</item>
</channel>
</rss>

+ 1
- 0
public/tags/bash/page/1/index.html View File

@ -0,0 +1 @@
<!DOCTYPE html><html><head><title>http://fr1nge.xyz/tags/bash/</title><link rel="canonical" href="http://fr1nge.xyz/tags/bash/"/><meta name="robots" content="noindex"><meta charset="utf-8" /><meta http-equiv="refresh" content="0; url=http://fr1nge.xyz/tags/bash/" /></head></html>

+ 24
- 0
public/tags/index.html View File

@ -138,6 +138,30 @@
<ul>
<li>
<a class="terms-title" href="http://fr1nge.xyz/tags/bash/">bash (1)</a>
</li>
<li>
<a class="terms-title" href="http://fr1nge.xyz/tags/programming/">programming (1)</a>
</li>
<li>
<a class="terms-title" href="http://fr1nge.xyz/tags/scripting/">scripting (1)</a>
</li>
</ul>
</div>


+ 32
- 1
public/tags/index.xml View File

@ -6,6 +6,37 @@
<description>Recent content in Tags on Fr1nge&#39;s Personal Blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright><atom:link href="http://fr1nge.xyz/tags/index.xml" rel="self" type="application/rss+xml" />
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/tags/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>bash</title>
<link>http://fr1nge.xyz/tags/bash/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/tags/bash/</guid>
<description></description>
<content></content>
</item>
<item>
<title>programming</title>
<link>http://fr1nge.xyz/tags/programming/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/tags/programming/</guid>
<description></description>
<content></content>
</item>
<item>
<title>scripting</title>
<link>http://fr1nge.xyz/tags/scripting/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/tags/scripting/</guid>
<description></description>
<content></content>
</item>
</channel>
</rss>

+ 216
- 0
public/tags/programming/index.html View File

@ -0,0 +1,216 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>programming :: Fr1nge&#39;s Personal Blog</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="" />
<meta name="keywords" content="Tech, Linux, Programming, Security" />
<meta name="robots" content="noodp" />
<link rel="canonical" href="http://fr1nge.xyz/tags/programming/" />
<link rel="stylesheet" href="http://fr1nge.xyz/assets/style.css">
<link rel="stylesheet" href="http://fr1nge.xyz/assets/blue.css">
<link rel="apple-touch-icon" href="http://fr1nge.xyz/img/apple-touch-icon-192x192.png">
<link rel="shortcut icon" href="http://fr1nge.xyz/img/favicon/blue.png">
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="yigitcolakoglu.com" />
<meta name="twitter:creator" content="" />
<meta property="og:locale" content="en" />
<meta property="og:type" content="website" />
<meta property="og:title" content="programming">
<meta property="og:description" content="" />
<meta property="og:url" content="http://fr1nge.xyz/tags/programming/" />
<meta property="og:site_name" content="Fr1nge&#39;s Personal Blog" />
<meta property="og:image" content="http://fr1nge.xyz/img/favicon/blue.png">
<meta property="og:image:width" content="2048">
<meta property="og:image:height" content="1024">
<link href="/tags/programming/index.xml" rel="alternate" type="application/rss+xml" title="Fr1nge&#39;s Personal Blog" />
</head>
<body class="blue">
<div class="container center headings--one-size">
<header class="header">
<div class="header__inner">
<div class="header__logo">
<a href="/">
<div class="logo">
fr1nge.xyz
</div>
</a>
</div>
<div class="menu-trigger">menu</div>
</div>
<nav class="menu">
<ul class="menu__inner menu__inner--desktop">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
<ul class="menu__inner menu__inner--mobile">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
</nav>
</header>
<div class="content">
<div class="posts">
<div class="post on-list">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a>
</h1>
<div class="post-meta">
<span class="post-date">
2021-05-05
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
<div class="post-content">
Bash is a great tool for automating tasks and improving you work flow. However, it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve the performance greatly.
</div>
<div>
<a class="read-more button"
href="/posts/supercharge-your-bash-scripts-with-multiprocessing/">Read more →</a>
</div>
</div>
<div class="pagination">
<div class="pagination__buttons">
</div>
</div>
</div>
</div>
<footer class="footer">
<div class="footer__inner">
<div class="copyright copyright--user">
<span>Yigit Colakoglu</span>
<span>:: Theme made by <a href="https://twitter.com/panr">panr</a></span>
</div>
</div>
</footer>
<script src="http://fr1nge.xyz/assets/main.js"></script>
<script src="http://fr1nge.xyz/assets/prism.js"></script>
</div>
</body>
</html>

+ 343
- 0
public/tags/programming/index.xml View File

@ -0,0 +1,343 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>programming on Fr1nge&#39;s Personal Blog</title>
<link>http://fr1nge.xyz/tags/programming/</link>
<description>Recent content in programming on Fr1nge&#39;s Personal Blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/tags/programming/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Supercharge Your Bash Scripts with Multiprocessing</title>
<link>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</guid>
<description>Bash is a great tool for automating tasks and improving your workflow. However, it is SLOW. Adding multiprocessing to the scripts you write can improve the performance greatly.
What is multiprocessing? In the simplest terms, multiprocessing is the principle of splitting the computations or jobs that a script has to do and running them on different processes. In even simpler terms however, multiprocessing is the computer science equivalent of hiring more than one worker when you are constructing a building.</description>
<content>&lt;p&gt;Bash is a great tool for automating tasks and improving your workflow. However,
it is &lt;em&gt;&lt;strong&gt;SLOW&lt;/strong&gt;&lt;/em&gt;. Adding multiprocessing to the scripts you write can improve
the performance greatly.&lt;/p&gt;
&lt;h2 id=&#34;what-is-multiprocessing&#34;&gt;What is multiprocessing?&lt;/h2&gt;
&lt;p&gt;In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.&lt;/p&gt;
&lt;h3 id=&#34;introducing-&#34;&gt;Introducing &amp;ldquo;&amp;amp;&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;While implementing multiprocessing, the &lt;code&gt;&amp;amp;&lt;/code&gt; sign is going to be our greatest
friend. It is essential if you are writing bash scripts and a very
useful tool in general when you are in the terminal. Appending &lt;code&gt;&amp;amp;&lt;/code&gt; to a
command runs that command in the background and lets the rest of the script
continue while it runs. One thing to keep in mind is that the background
command runs in a fork of the current shell, so it gets its own copy of every
variable. If the script changes a variable after spawning the command, the
background command still sees the old value. Here is a simple
example:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
foo=&amp;#34;yeet&amp;#34;
function run_in_background(){
sleep 0.5
echo &amp;#34;The value of foo in the function run_in_background is $foo&amp;#34;
}
run_in_background &amp;amp; # Spawn the function run_in_background in the background
foo=&amp;#34;YEET&amp;#34;
echo &amp;#34;The value of foo changed to $foo.&amp;#34;
wait # wait for the background process to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This should output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, the value of &lt;code&gt;foo&lt;/code&gt; did not change in the background process even though
we changed it in the main script.&lt;/p&gt;
&lt;h2 id=&#34;baby-steps&#34;&gt;Baby steps&amp;hellip;&lt;/h2&gt;
&lt;p&gt;Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&amp;rsquo;s open up vim and get to scripting!
First of all, let&amp;rsquo;s write a very simple function that allows us to easily test
our implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled being the keyword here. That&amp;rsquo;s because
each user can only spawn a limited number of processes (you can find
your limit with the command &lt;code&gt;ulimit -u&lt;/code&gt;). In our case, we want to limit the
number of processes running at once to the variable &lt;code&gt;num_processes&lt;/code&gt;. Here is the implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&amp;#43;&amp;#43;==0)) &amp;amp;&amp;amp; wait # a batch is full: wait for it to finish
tester &amp;#34;$i&amp;#34; &amp;amp;
done
wait # wait for the last batch
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This loop takes the number of processes you would like to
spawn as an argument and runs &lt;code&gt;tester&lt;/code&gt; in that many processes at a time. Go ahead and test it out!
You might notice, however, that the processes run in batches, and the size of each
batch is &lt;code&gt;num_processes&lt;/code&gt;. This happens because
every time we have spawned &lt;code&gt;num_processes&lt;/code&gt; processes, we &lt;code&gt;wait&lt;/code&gt; for all of them
to end, so one slow job holds up the whole next batch. This is not a problem in itself; there are many cases
where this implementation works perfectly fine. However, if
you don&amp;rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.&lt;/p&gt;
&lt;h2 id=&#34;real-chads-use-job-pools&#34;&gt;Real Chads use Job Pools&lt;/h2&gt;
&lt;p&gt;The solution to the bottleneck introduced by our previous approach lies
in using job pools. A job pool is a queue where jobs created by a main process are sent
and wait to be executed. This approach solves our problem because, instead of
spawning a new process for every job and waiting for a whole batch to
finish, we create a fixed number of processes (workers) which
continuously pick up jobs from the pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
job_pool_end_of_jobs=&amp;#34;NO_JOB_LEFT&amp;#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&amp;lt;&amp;gt; ${job_queue}
while [[ &amp;#34;${cmd}&amp;#34; != &amp;#34;${job_pool_end_of_jobs}&amp;#34; &amp;amp;&amp;amp; -e &amp;#34;${job_queue}&amp;#34; ]]; do
flock --exclusive 7
IFS=$&amp;#39;\v&amp;#39;
read cmd args &amp;lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &amp;#34;${cmd}&amp;#34; == &amp;#34;${job_pool_end_of_jobs}&amp;#34; ]]; then
echo &amp;#34;${cmd}&amp;#34; &amp;gt;&amp;amp;7
else
{ ${cmd} &amp;#34;$@&amp;#34; ; }
fi
done
exec 7&amp;gt;&amp;amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &amp;gt;&amp;gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&amp;lt;${job_pool_pool_size}; i&amp;#43;&amp;#43;)); do
job_pool_worker ${i} ${job_queue} &amp;amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &amp;gt;${job_pool_progress} &amp;amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &amp;#34;${job_pool_pool_size}&amp;#34; == &amp;#34;-1&amp;#34; ]]; then
job_pool_init
fi
printf &amp;#34;%s\v&amp;#34; &amp;#34;$@&amp;#34; &amp;gt;&amp;gt; ${job_pool_job_queue}
echo &amp;gt;&amp;gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Ok&amp;hellip; But what the actual fuck is going on in here???&lt;/p&gt;
&lt;h3 id=&#34;fifo-and-flock&#34;&gt;fifo and flock&lt;/h3&gt;
&lt;p&gt;In order to understand what this code is doing, you first need to understand two
key tools that we are using: the &lt;code&gt;fifo&lt;/code&gt; (a special file, created here with &lt;code&gt;mkfifo&lt;/code&gt;) and the &lt;code&gt;flock&lt;/code&gt; command. Despite their complicated
names, they are actually quite simple. Let&amp;rsquo;s check their man pages to figure out
their purposes, shall we?&lt;/p&gt;
&lt;h4 id=&#34;man-fifo&#34;&gt;man fifo&lt;/h4&gt;
&lt;p&gt;fifo&amp;rsquo;s man page tells us that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So put in &lt;strong&gt;very&lt;/strong&gt; simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read a job
with &lt;code&gt;read cmd args &amp;lt; ${job_queue}&lt;/code&gt;, it is out of the pipe and the next
read returns the next job in the pool. However, the fact that we have multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Their reads could interleave, leaving each worker with a fragment of a job, and we don&amp;rsquo;t want that. So we resort
to using &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id=&#34;man-flock&#34;&gt;man flock&lt;/h4&gt;
&lt;p&gt;flock&amp;rsquo;s man page defines it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Cool, translated to modern English that us regular folks use, &lt;code&gt;flock&lt;/code&gt; is a thin
wrapper around the &lt;code&gt;flock&lt;/code&gt; system call (see &lt;code&gt;man 2 flock&lt;/code&gt; if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its &lt;strong&gt;file descriptor number&lt;/strong&gt;. Aha! So that was the purpose of the &lt;code&gt;exec 7&amp;lt;&amp;gt; ${job_queue}&lt;/code&gt; call in the &lt;code&gt;job_pool_worker&lt;/code&gt; function. It
assigns the file descriptor 7 to the fifo &lt;code&gt;job_queue&lt;/code&gt;, and the worker afterwards locks it with
&lt;code&gt;flock --exclusive 7&lt;/code&gt;. Cool. This way only one process at a time can read from
the fifo &lt;code&gt;job_queue&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;great-but-how-do-i-use-this&#34;&gt;Great! But how do I use this?&lt;/h2&gt;
&lt;p&gt;It depends on your preference: you can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
code below under our &amp;ldquo;chad&amp;rdquo; job pool script.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
num_workers=$1
job_pool_init $num_workers
pcount=0
for i in {1..10}; do
job_pool_run tester &amp;#34;$i&amp;#34;
done
job_pool_wait
job_pool_shutdown
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hopefully this article was (or will be) helpful to you. From now on, you don&amp;rsquo;t
ever have to write single-threaded bash scripts like normies :)&lt;/p&gt;
</content>
</item>
</channel>
</rss>

+ 1
- 0
public/tags/programming/page/1/index.html View File

@ -0,0 +1 @@
<!DOCTYPE html><html><head><title>http://fr1nge.xyz/tags/programming/</title><link rel="canonical" href="http://fr1nge.xyz/tags/programming/"/><meta name="robots" content="noindex"><meta charset="utf-8" /><meta http-equiv="refresh" content="0; url=http://fr1nge.xyz/tags/programming/" /></head></html>

+ 216
- 0
public/tags/scripting/index.html View File

@ -0,0 +1,216 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>scripting :: Fr1nge&#39;s Personal Blog</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="" />
<meta name="keywords" content="Tech, Linux, Programming, Security" />
<meta name="robots" content="noodp" />
<link rel="canonical" href="http://fr1nge.xyz/tags/scripting/" />
<link rel="stylesheet" href="http://fr1nge.xyz/assets/style.css">
<link rel="stylesheet" href="http://fr1nge.xyz/assets/blue.css">
<link rel="apple-touch-icon" href="http://fr1nge.xyz/img/apple-touch-icon-192x192.png">
<link rel="shortcut icon" href="http://fr1nge.xyz/img/favicon/blue.png">
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="yigitcolakoglu.com" />
<meta name="twitter:creator" content="" />
<meta property="og:locale" content="en" />
<meta property="og:type" content="website" />
<meta property="og:title" content="scripting">
<meta property="og:description" content="" />
<meta property="og:url" content="http://fr1nge.xyz/tags/scripting/" />
<meta property="og:site_name" content="Fr1nge&#39;s Personal Blog" />
<meta property="og:image" content="http://fr1nge.xyz/img/favicon/blue.png">
<meta property="og:image:width" content="2048">
<meta property="og:image:height" content="1024">
<link href="/tags/scripting/index.xml" rel="alternate" type="application/rss+xml" title="Fr1nge&#39;s Personal Blog" />
</head>
<body class="blue">
<div class="container center headings--one-size">
<header class="header">
<div class="header__inner">
<div class="header__logo">
<a href="/">
<div class="logo">
fr1nge.xyz
</div>
</a>
</div>
<div class="menu-trigger">menu</div>
</div>
<nav class="menu">
<ul class="menu__inner menu__inner--desktop">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
<ul class="menu__inner menu__inner--mobile">
<li><a href="/about">About</a></li>
<li><a href="/awards">Awards &amp; Certificates</a></li>
<li><a href="/projects">Projects</a></li>
</ul>
</nav>
</header>
<div class="content">
<div class="posts">
<div class="post on-list">
<h1 class="post-title">
<a href="http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/">Supercharge Your Bash Scripts with Multiprocessing</a>
</h1>
<div class="post-meta">
<span class="post-date">
2021-05-05
</span>
<span class="post-author">:: Yigit Colakoglu</span>
</div>
<span class="post-tags">
#<a href="http://fr1nge.xyz/tags/bash/">bash</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/scripting/">scripting</a>&nbsp;
#<a href="http://fr1nge.xyz/tags/programming/">programming</a>&nbsp;
</span>
<div class="post-content">
Bash is a great tool for automating tasks and improving your workflow. However, it is <em><strong>SLOW</strong></em>. Adding multiprocessing to the scripts you write can improve the performance greatly.
</div>
<div>
<a class="read-more button"
href="/posts/supercharge-your-bash-scripts-with-multiprocessing/">Read more →</a>
</div>
</div>
<div class="pagination">
<div class="pagination__buttons">
</div>
</div>
</div>
</div>
<footer class="footer">
<div class="footer__inner">
<div class="copyright copyright--user">
<span>Yigit Colakoglu</span>
<span>:: Theme made by <a href="https://twitter.com/panr">panr</a></span>
</div>
</div>
</footer>
<script src="http://fr1nge.xyz/assets/main.js"></script>
<script src="http://fr1nge.xyz/assets/prism.js"></script>
</div>
</body>
</html>

+ 343
- 0
public/tags/scripting/index.xml View File

@ -0,0 +1,343 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>scripting on Fr1nge&#39;s Personal Blog</title>
<link>http://fr1nge.xyz/tags/scripting/</link>
<description>Recent content in scripting on Fr1nge&#39;s Personal Blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>Yigit Colakoglu</copyright>
<lastBuildDate>Wed, 05 May 2021 17:08:12 +0300</lastBuildDate><atom:link href="http://fr1nge.xyz/tags/scripting/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Supercharge Your Bash Scripts with Multiprocessing</title>
<link>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</link>
<pubDate>Wed, 05 May 2021 17:08:12 +0300</pubDate>
<guid>http://fr1nge.xyz/posts/supercharge-your-bash-scripts-with-multiprocessing/</guid>
<description>Bash is a great tool for automating tasks and improving your workflow. However, it is SLOW. Adding multiprocessing to the scripts you write can improve the performance greatly.
What is multiprocessing? In the simplest terms, multiprocessing is the principle of splitting the computations or jobs that a script has to do and running them on different processes. In even simpler terms however, multiprocessing is the computer science equivalent of hiring more than one worker when you are constructing a building.</description>
<content>&lt;p&gt;Bash is a great tool for automating tasks and improving your workflow. However,
it is &lt;em&gt;&lt;strong&gt;SLOW&lt;/strong&gt;&lt;/em&gt;. Adding multiprocessing to the scripts you write can improve
the performance greatly.&lt;/p&gt;
&lt;h2 id=&#34;what-is-multiprocessing&#34;&gt;What is multiprocessing?&lt;/h2&gt;
&lt;p&gt;In the simplest terms, multiprocessing is the principle of splitting the
computations or jobs that a script has to do and running them on different
processes. In even simpler terms however, multiprocessing is the computer
science equivalent of hiring more than one
worker when you are constructing a building.&lt;/p&gt;
&lt;h3 id=&#34;introducing-&#34;&gt;Introducing &amp;ldquo;&amp;amp;&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;While implementing multiprocessing, the &lt;code&gt;&amp;amp;&lt;/code&gt; sign is going to be our greatest
friend. It is an essential sign if you are writing bash scripts and a very
useful tool in general when you are in the terminal. What &lt;code&gt;&amp;amp;&lt;/code&gt; does is
run the command it is appended to in the background, allowing
the rest of the script to continue while that command runs.
One thing to keep in mind is that the background command runs in a forked copy
of the shell, so if you change a variable in the main script after starting it,
the background command still sees the old value. Here is a simple
example:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
foo=&amp;#34;yeet&amp;#34;
function run_in_background(){
sleep 0.5
echo &amp;#34;The value of foo in the function run_in_background is $foo&amp;#34;
}
run_in_background &amp;amp; # Spawn the function run_in_background in the background
foo=&amp;#34;YEET&amp;#34;
echo &amp;#34;The value of foo changed to $foo.&amp;#34;
wait # wait for the background process to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This should output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;The value of foo changed to YEET.
The value of foo in the function run_in_background is yeet
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, the value of &lt;code&gt;foo&lt;/code&gt; did not change in the background process even though
we changed it in the main script.&lt;/p&gt;
&lt;h2 id=&#34;baby-steps&#34;&gt;Baby steps&amp;hellip;&lt;/h2&gt;
&lt;p&gt;Just like anything related to computer science, there is more than one way of
achieving our goal. We are going to take the easier, less intimidating but less
efficient route first before moving on to the big boy implementation. Let&amp;rsquo;s open up vim and get to scripting!
First of all, let&amp;rsquo;s write a very simple function that allows us to easily test
our implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now that we have something to run in our processes, we need to spawn several
of them in a controlled manner. Controlled being the keyword here. That&amp;rsquo;s because
each user can only spawn a limited number of processes (you can find
that limit with the command &lt;code&gt;ulimit -u&lt;/code&gt;). In our case, we want to limit the
number of processes running at once to the variable &lt;code&gt;num_processes&lt;/code&gt;. Here is the implementation:&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
num_processes=$1
pcount=0
for i in {1..10}; do
((pcount=pcount%num_processes));
((pcount&amp;#43;&amp;#43;==0)) &amp;amp;&amp;amp; wait
tester $i &amp;amp;
done
wait # wait for the last batch to finish
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;What this loop does is that it takes the number of processes you would like to
spawn as an argument and runs &lt;code&gt;tester&lt;/code&gt; in that many processes. Go ahead and test it out!
You might notice, however, that the processes are run in batches, and the size of
each batch is the &lt;code&gt;num_processes&lt;/code&gt; variable. This happens because
every time we have spawned &lt;code&gt;num_processes&lt;/code&gt; processes, we &lt;code&gt;wait&lt;/code&gt; for all of them
to end before spawning more. This implementation is not a problem in itself; there are many cases
where you can use it and it works perfectly fine. However, if
you don&amp;rsquo;t want this to happen, we have to dump this naive approach altogether
and improve our tool belt.&lt;/p&gt;
&lt;h2 id=&#34;real-chads-use-job-pools&#34;&gt;Real Chads use Job Pools&lt;/h2&gt;
&lt;p&gt;The solution to the bottleneck that was introduced in our previous approach lies
in using job pools. A job pool is a queue where jobs created by a main process get sent
and wait to be executed. This approach solves our problem because instead of
spawning a new process for every job and waiting for the whole batch to
finish, we only create a fixed number of processes (workers), which
continuously pick up jobs from the job pool without waiting for any other process to finish.
Here is the implementation that uses job pools. Brace yourselves, because it is
kind of complicated.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
job_pool_end_of_jobs=&amp;#34;NO_JOB_LEFT&amp;#34;
job_pool_job_queue=/tmp/job_pool_job_queue_$$
job_pool_progress=/tmp/job_pool_progress_$$
job_pool_pool_size=-1
job_pool_nerrors=0
function job_pool_cleanup()
{
rm -f ${job_pool_job_queue}
rm -f ${job_pool_progress}
}
function job_pool_exit_handler()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_worker()
{
local id=$1
local job_queue=$2
local cmd=
local args=
exec 7&amp;lt;&amp;gt; ${job_queue}
while [[ &amp;#34;${cmd}&amp;#34; != &amp;#34;${job_pool_end_of_jobs}&amp;#34; &amp;amp;&amp;amp; -e &amp;#34;${job_queue}&amp;#34; ]]; do
flock --exclusive 7
IFS=$&amp;#39;\v&amp;#39;
read cmd args &amp;lt;${job_queue}
set -- ${args}
unset IFS
flock --unlock 7
if [[ &amp;#34;${cmd}&amp;#34; == &amp;#34;${job_pool_end_of_jobs}&amp;#34; ]]; then
echo &amp;#34;${cmd}&amp;#34; &amp;gt;&amp;amp;7
else
{ ${cmd} &amp;#34;$@&amp;#34; ; }
fi
done
exec 7&amp;gt;&amp;amp;-
}
function job_pool_stop_workers()
{
echo ${job_pool_end_of_jobs} &amp;gt;&amp;gt; ${job_pool_job_queue}
wait
}
function job_pool_start_workers()
{
local job_queue=$1
for ((i=0; i&amp;lt;${job_pool_pool_size}; i&amp;#43;&amp;#43;)); do
job_pool_worker ${i} ${job_queue} &amp;amp;
done
}
function job_pool_init()
{
local pool_size=$1
job_pool_pool_size=${pool_size:=1}
rm -rf ${job_pool_job_queue}
rm -rf ${job_pool_progress}
touch ${job_pool_progress}
mkfifo ${job_pool_job_queue}
echo 0 &amp;gt;${job_pool_progress} &amp;amp;
job_pool_start_workers ${job_pool_job_queue}
}
function job_pool_shutdown()
{
job_pool_stop_workers
job_pool_cleanup
}
function job_pool_run()
{
if [[ &amp;#34;${job_pool_pool_size}&amp;#34; == &amp;#34;-1&amp;#34; ]]; then
job_pool_init
fi
printf &amp;#34;%s\v&amp;#34; &amp;#34;$@&amp;#34; &amp;gt;&amp;gt; ${job_pool_job_queue}
echo &amp;gt;&amp;gt; ${job_pool_job_queue}
}
function job_pool_wait()
{
job_pool_stop_workers
job_pool_start_workers ${job_pool_job_queue}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Ok&amp;hellip; But what the actual fuck is going on in here???&lt;/p&gt;
&lt;h3 id=&#34;fifo-and-flock&#34;&gt;fifo and flock&lt;/h3&gt;
&lt;p&gt;In order to understand what this code is doing, you first need to understand two
key tools that we are using: the &lt;code&gt;fifo&lt;/code&gt; (created with &lt;code&gt;mkfifo&lt;/code&gt;) and &lt;code&gt;flock&lt;/code&gt;. Despite their complicated
names, they are actually quite simple. Let&amp;rsquo;s check their man pages to figure out
their purposes, shall we?&lt;/p&gt;
&lt;h4 id=&#34;man-fifo&#34;&gt;man fifo&lt;/h4&gt;
&lt;p&gt;fifo&amp;rsquo;s man page tells us that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NAME
fifo - first-in first-out special file, named pipe
DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that
it is accessed as part of the filesystem. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the filesystem. Thus, the FIFO special file has no contents on the
filesystem; the filesystem entry merely serves as a reference point so
that processes can access the pipe using a name in the filesystem.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So put in &lt;strong&gt;very&lt;/strong&gt; simple terms, a fifo is a named pipe that allows
communication between processes. Using a fifo allows us to loop through the jobs
in the pool without having to delete them manually, because once we read them
with &lt;code&gt;read cmd args &amp;lt; ${job_queue}&lt;/code&gt;, the job is out of the pipe and the next
read outputs the next job in the pool. However, the fact that we have multiple
processes introduces one caveat: what if two processes read from the pipe at the
same time? Each could end up with a fragment of the same job line, and we don&amp;rsquo;t want that. So we resort
to using &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;
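&lt;p&gt;To get a feel for the file side of a fifo, here is a minimal sketch that creates one and checks its type. It assumes &lt;code&gt;mkfifo&lt;/code&gt; and &lt;code&gt;mktemp&lt;/code&gt; are available; the helper name &lt;code&gt;describe_fifo&lt;/code&gt; and the private temporary directory are illustrative choices and are not part of the job pool script.&lt;/p&gt;

```shell
# Sketch: create a FIFO inside a private temp directory and inspect it.
# describe_fifo is a hypothetical helper name, not part of the job pool script.
describe_fifo() {
    local dir queue
    dir="$(mktemp -d)" || return 1
    queue="$dir/job_queue"
    mkfifo "$queue" || return 1
    # test -p succeeds only when the path is a named pipe
    if [ -p "$queue" ]; then
        echo "named pipe"
    else
        echo "not a pipe"
    fi
    rm -rf "$dir"
}

describe_fifo
```

&lt;p&gt;The path behaves like any other filesystem entry (it can be listed and removed), but the data written to it only ever lives in the kernel.&lt;/p&gt;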
&lt;h4 id=&#34;man-flock&#34;&gt;man flock&lt;/h4&gt;
&lt;p&gt;flock&amp;rsquo;s man page defines it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; SYNOPSIS
flock [options] file|directory command [arguments]
flock [options] file|directory -c command
flock [options] number
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or from
the command line.
The first and second of the above forms wrap the lock around the
execution of a command, in a manner similar to su(1) or newgrp(1).
They lock a specified file or directory, which is created (assuming
appropriate permissions) if it does not already exist. By default, if
the lock cannot be immediately acquired, flock waits until the lock is
available.
The third form uses an open file by its file descriptor number. See
the examples below for how that can be used.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Cool, translated to modern English that us regular folks use, &lt;code&gt;flock&lt;/code&gt; is a thin
wrapper around the &lt;code&gt;flock&lt;/code&gt; system call (see &lt;code&gt;man 2 flock&lt;/code&gt; if you are
interested). It is used to manage locks and has several forms. The one we are
interested in is the third one. According to the man page, it uses an open file
by its &lt;strong&gt;file descriptor number&lt;/strong&gt;. Aha! So that was the purpose of the &lt;code&gt;exec 7&amp;lt;&amp;gt; ${job_queue}&lt;/code&gt; call in the &lt;code&gt;job_pool_worker&lt;/code&gt; function. It
assigns the file descriptor 7 to the fifo &lt;code&gt;job_queue&lt;/code&gt;, and the worker afterwards locks it with
&lt;code&gt;flock --exclusive 7&lt;/code&gt;. Cool. This way only one process at a time can read from
the fifo &lt;code&gt;job_queue&lt;/code&gt;.&lt;/p&gt;
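&lt;p&gt;If you want to see the mutual exclusion for yourself, here is a small sketch. Note that it uses the first form from the man page (lock a file, then run a command) rather than the file descriptor form the job pool uses, and it assumes the util-linux &lt;code&gt;flock&lt;/code&gt; utility. The helper name &lt;code&gt;lock_is_exclusive&lt;/code&gt; is made up for this example.&lt;/p&gt;

```shell
# Sketch: show that a lock which is already held refuses a second taker.
# lock_is_exclusive is a hypothetical helper name, not part of the job pool script.
lock_is_exclusive() {
    local lock
    lock="$(mktemp)" || return 1
    # The outer flock holds an exclusive lock while it runs the inner command.
    # The inner flock opens the same file again and, because of -n, gives up
    # immediately instead of waiting, so a non-zero status means "already locked".
    if flock "$lock" flock -n "$lock" true; then
        echo "second lock acquired"
    else
        echo "second lock refused"
    fi
    rm -f "$lock"
}

lock_is_exclusive
```

&lt;p&gt;Without &lt;code&gt;-n&lt;/code&gt; the inner call would wait for the lock instead of failing, which is the behaviour the workers rely on: each one blocks until it is its turn to read from the queue.&lt;/p&gt;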
&lt;h2 id=&#34;great-but-how-do-i-use-this&#34;&gt;Great! But how do I use this?&lt;/h2&gt;
&lt;p&gt;It depends on your preference: you can either save this in a file (e.g.
job_pool.sh) and source it in your bash script, or you can simply paste it
inside an existing bash script. Whatever tickles your fancy. I have also
provided an example that replicates our first implementation. Just paste the
code below under our &amp;ldquo;chad&amp;rdquo; job pool script.&lt;/p&gt;
&lt;div class=&#34;collapsable-code&#34;&gt;
&lt;input id=&#34;1&#34; type=&#34;checkbox&#34; /&gt;
&lt;label for=&#34;1&#34;&gt;
&lt;span class=&#34;collapsable-code__language&#34;&gt;bash&lt;/span&gt;
&lt;span class=&#34;collapsable-code__toggle&#34; data-label-expand=&#34;Show&#34; data-label-collapse=&#34;Hide&#34;&gt;&lt;/span&gt;
&lt;/label&gt;
&lt;pre class=&#34;language-bash&#34; &gt;&lt;code&gt;
function tester(){
# A function that takes an int as a parameter and sleeps
echo &amp;#34;$1&amp;#34;
sleep &amp;#34;$1&amp;#34;
echo &amp;#34;ENDED $1&amp;#34;
}
num_workers=$1
job_pool_init $num_workers
pcount=0
for i in {1..10}; do
job_pool_run tester &amp;#34;$i&amp;#34;
done
job_pool_wait
job_pool_shutdown
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hopefully this article was (or will be) helpful to you. From now on, you don&amp;rsquo;t
ever have to write single-threaded bash scripts like normies :)&lt;/p&gt;
</content>
</item>
</channel>
</rss>

+ 1
- 0
public/tags/scripting/page/1/index.html View File

@ -0,0 +1 @@
<!DOCTYPE html><html><head><title>http://fr1nge.xyz/tags/scripting/</title><link rel="canonical" href="http://fr1nge.xyz/tags/scripting/"/><meta name="robots" content="noindex"><meta charset="utf-8" /><meta http-equiv="refresh" content="0; url=http://fr1nge.xyz/tags/scripting/" /></head></html>

BIN
static/images/glasses.png View File

Width: 980  |  Height: 330  |  Size: 29 KiB

BIN
static/images/supercharge-your-bash-scripts-with-multiprocessing.png View File

Width: 642  |  Height: 262  |  Size: 9.1 KiB
