nfn wrote:
1. If you have a link inside the body that matches the url you want to remove from the cache, that page will be removed.
You can change the invocation of nginx_cache_purge_item in the last line of the script to modify the regex. For example, if you know the item you are searching for is preceded by "foo" and followed by "bar" on the same line, you could change it to:
Code:
nginx_cache_purge_item "foo.*$1.*bar" "$2"
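To see the effect of the tightened pattern, here's a throwaway demo (the file, URL, and surrounding text are all made up for illustration):

```shell
# Demo: the entry only matches when "foo" and "bar" bracket the URL
# on the same line. Everything below is invented sample data.
tmp=$(mktemp)
printf 'foo http://example.com/page bar\nplain http://example.com/page\n' > "$tmp"
# Only the first line matches the tightened regex
grep -c 'foo.*http://example\.com/page.*bar' "$tmp"   # prints 1
rm -f "$tmp"
```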
nfn wrote:
2. If we have a cache with 1000 files with 100 lines each and only 10 files match our criteria, grep will need to search 99020 lines instead of 2000 if I'm not wrong.
I'm interpreting this to mean you have filenames in a specific format? For example, you want to search files whose names start with "cache-" but no others? In that case, you could modify the get_cache_files function like so:
Code:
function get_cache_files() {
local max_parallel=${3-16}
find "$2" -type f -name "cache-*" | \
xargs -P "$max_parallel" -n 100 grep -l "$1" | sort -u
} # get_cache_files
This uses find to select the files to be scanned, rather than grep's recursive search.
Instead of kicking off 16 grep processes, each searching a particular directory, it starts up to 16 grep processes, each given a list of 100 filenames to check. If there are more than 1600 files, a new grep process will be started once one of the 16 completes. You may need to tweak the value of $max_parallel or the value of the
xargs -n option to make it run faster. (I deleted the comments since the changes really invalidate them.)
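If you want to pick those numbers empirically rather than guess, a quick timing harness like this can help. The cache layout and search pattern below are invented for the demo; point it at your real cache directory and URL pattern instead:

```shell
# Hedged sketch: time a few xargs batch sizes against a throwaway
# "cache" directory to find a good -n value. All names are made up.
demo=$(mktemp -d)
for i in $(seq 1 50); do
    printf 'KEY: http://example.com/page%s\n' "$i" > "$demo/cache-$i"
done
for batch in 10 25 50; do
    echo "batch size $batch:"
    # Same find|xargs|grep pipeline as the function, with -n varied
    time find "$demo" -type f -name "cache-*" | \
        xargs -P 16 -n "$batch" grep -l 'page7$' > /dev/null
done
rm -rf "$demo"
```

On a real cache the differences between batch sizes will be far more visible than on this tiny demo, since startup cost per grep process only matters once each process has substantial work to do.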