Tuesday, July 23, 2013

Using Glob in PHP

In the previous article, "Files and Folder Iterator in PHP", we discussed how to iterate through files and folders with functions like dir(), opendir(), readdir() etc. In this section, we will see how the glob() function can be used to find and iterate through files and folders that match a pattern, relative to the current directory. Check the code snippet below:

<?php
// Change to Folder
chdir('joomla2.5.9');

// Iterate through all entries matching the pattern *.*
foreach( glob("*.*") as $filename )
{
  echo "<br> File Name : $filename, Size ::". filesize($filename);
}
?>


The above code lists all entries matching the pattern "*.*" in the subfolder 'joomla2.5.9', that is, every name that contains a dot. The pattern can be anything like "*.txt", "*.php", "conf*.*" etc. We can also include path information in the pattern, as shown in the code below:

<?php
$patt = "text_files/*a.*";

foreach( glob($patt) as $filename )
{
  echo "<br> File Name : $filename, Size ::". filesize($filename);
}
?>


The above code lists all files in the folder named "text_files" whose names (excluding the extension) end with 'a'. If we want to list all files and folders, we can use the following pattern:

<?php
foreach( glob("text_files/*") as $filename )
{
    echo "<br> Name :: $filename";
}
?>


If we need to list only the folders, we can pass a flag to glob() as shown below:

<?php
foreach( glob("text_files/*", GLOB_ONLYDIR | GLOB_NOSORT  ) as $filename )
{
    echo "<br> Folder Name :: $filename";
}
?>


The flag GLOB_ONLYDIR returns only the folders that match the pattern. The flag GLOB_NOSORT returns the entries in the order they appear in the directory instead of sorting them alphabetically.
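
The two flags can be wrapped in a small helper. The snippet below is only a sketch: the function name list_subfolders() is made up for this illustration, and the type declarations assume PHP 7 or later. It also guards against glob() returning false, which it can do on error.

```php
<?php
// Hypothetical helper (not part of the original article): returns the names
// of the immediate sub-folders of $dir, sorted alphabetically.
function list_subfolders(string $dir): array
{
    // glob() may return false on error, so fall back to an empty array.
    $matches = glob($dir . '/*', GLOB_ONLYDIR | GLOB_NOSORT) ?: [];

    // Keep only the last path component of each match.
    $names = array_map('basename', $matches);

    // GLOB_NOSORT skipped glob's own sorting, so sort explicitly here.
    sort($names);

    return $names;
}

foreach (list_subfolders('text_files') as $name) {
    echo "<br> Folder Name :: $name";
}
```

If "text_files" does not exist or has no sub-folders, the loop simply prints nothing.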

We can also use is_dir() to check whether an entry is a folder or not, as shown in the code below:

<?php
foreach( glob("text_files/*") as $filename )
{
  if( is_dir($filename ) )
    echo "<br> Folder Name :: $filename";
}
?>


Say my current directory is c:\xampp\htdocs and I would like to see all the folders in the parent folder c:\xampp whose names start with "p" and are exactly 3 characters long. I need to change the pattern a bit:

<?php
foreach( glob("../p??", GLOB_ONLYDIR | GLOB_NOSORT ) as $filename )
{
    echo "<br> Folder Name :: $filename";
}
?>


The output is:

Folder Name :: ../php

Next, we build a function that scans a given directory and lists all the files and sub-folders under it. Check the code below:

<?php
// Set a path
$dir = "text_files";

// Call the function, starting with an empty prefix
get_listing( $dir, "" );

// Function definition
function get_listing($dir, $space)
{
  // Build path pattern
  $path = "$dir/*";

  // Loop through each item
  foreach( glob( $path ) as $filename )
  {
    echo "{$space}{$filename}<br>";

    // If sub-folder, call recursively with a longer prefix
    if( is_dir($filename) )
    {
      get_listing( $filename, $space . "-" );
    }
  }
}
?>


The code above is quite self-explanatory. We have created a function called 'get_listing'. Inside it, we find files and folders using the pattern stored in the variable "$path". If an item is a directory (is_dir() returns TRUE for it), we call the same function recursively to scan its contents, adding one more "-" to the prefix so that deeper levels are marked with more dashes.
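
When the listing is needed for further processing rather than for display, the same recursion can collect the paths into an array. The following is a minimal sketch under that assumption; collect_listing() is a hypothetical name, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical variant of get_listing(): collects the paths into an array
// instead of echoing them.
function collect_listing(string $dir): array
{
    $found = [];

    // glob() may return false on error, so fall back to an empty array.
    foreach (glob("$dir/*") ?: [] as $path) {
        $found[] = $path;

        // Descend into sub-folders and append whatever they contain.
        if (is_dir($path)) {
            $found = array_merge($found, collect_listing($path));
        }
    }

    return $found;
}

foreach (collect_listing('text_files') as $path) {
    echo "$path<br>";
}
```

Like the original function, this uses the pattern "*", so hidden entries such as .htaccess are skipped. Because is_dir() follows symbolic links, a link that points back to a parent folder could make the recursion loop.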

A pattern like "*" does not return hidden files such as .htaccess, because a leading dot has to be matched explicitly.

".*" - This pattern matches files like .htaccess. But it also includes the special directory entries "." (current directory) and ".." (parent directory).

"n*" - This pattern would return all files and folders whose names start with "n". This is case-sensitive.

"*p" - This pattern would return all files and folders whose names end with "p". This would match folders like "jump", "pump" and file names like "a.php", "c_d.ctp" etc. It would also match a name that is just "p".

"*im*" - This would match those files/folders whose names contain "im". Since "*" can also match nothing, this covers names like "image", "sikkim" or even just "im".

"{,.}*" - Curly braces hold alternatives separated by commas, and they are expanded only when the GLOB_BRACE flag is passed as the second argument to glob(). With that flag, this pattern searches for both "*" and ".*", so it includes all files, folders, special directory entries like "." and ".." and hidden files like .htaccess. Check another example of a globbing pattern with curly braces below:

"sample.{png,gif,tiff,jpeg,jpg}" - With GLOB_BRACE, this pattern would search for a file named 'sample' with any of the extensions given within the curly braces. If the directory has files "sample.png", "sample.tiff", "sample.jpeg", all of them would be listed.

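The brace pattern "{,.}*" is handy for listing hidden files, but the "." and ".." entries usually have to be dropped afterwards. Below is a small sketch of one way to do that. The helper name list_all_entries() is an assumption made for this example, and the PHP manual notes that GLOB_BRACE is not available on some non-GNU systems (for example Solaris or Alpine Linux).

```php
<?php
// Hypothetical helper: lists every entry in $dir, hidden ones included,
// but without the special entries "." and "..".
function list_all_entries(string $dir): array
{
    // GLOB_BRACE expands "{,.}*" into the two patterns "*" and ".*".
    $matches = glob($dir . '/{,.}*', GLOB_BRACE) ?: [];

    // Keep only the last path component of each match.
    $names = array_map('basename', $matches);

    // Drop "." and "..", then re-index and sort the remaining names.
    $names = array_values(array_diff($names, ['.', '..']));
    sort($names);

    return $names;
}

foreach (list_all_entries('text_files') as $name) {
    echo "<br> Name :: $name";
}
```
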
With square brackets, we can provide a set or a range of characters to match. A bracket expression always matches exactly one character. The characters are listed without separators; a comma inside the brackets is treated as one more character to match. Check the examples below.

"m[auo]m" - This would match any of "mam", "mom" or "mum".
"[njf]*"  - This would fetch all files/folders whose names start with any of "n", "j" or "f".
"b[a-d]d"   - This would match "bad", "bbd", "bcd" and "bdd" because we have provided a range a to d.
"f[ee,oo]l.txt" - This won't match "feel.txt" or "fool.txt". We cannot use multi-character tokens inside square brackets; the set here is just "e", "," and "o", so the pattern matches "fel.txt" and "fol.txt" instead. To match alternatives longer than one character, use curly braces with the GLOB_BRACE flag, as in "f{ee,oo}l.txt".

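To see how commas behave inside square brackets, we can try the patterns against a throwaway folder. This is only an illustrative sketch: glob_names() is a hypothetical helper, the folder and file names are made up, and it assumes the system temp directory is writable.

```php
<?php
// Hypothetical helper: runs glob() inside $dir and returns the sorted names.
function glob_names(string $dir, string $pattern): array
{
    // Switch into $dir so the pattern needs no path prefix, then switch back.
    $previous = getcwd();
    chdir($dir);
    $names = glob($pattern) ?: [];
    chdir($previous);

    sort($names);
    return $names;
}

// Build a throwaway folder with a few empty files to match against.
$dir = sys_get_temp_dir() . '/glob_brackets_' . uniqid();
mkdir($dir);
foreach (['mam', 'mom', 'mum', 'mim', 'm,m'] as $name) {
    touch("$dir/$name");
}

echo implode(' | ', glob_names($dir, 'm[auo]m')), "\n";   // mam | mom | mum
echo implode(' | ', glob_names($dir, 'm[a,u,o]m')), "\n"; // m,m | mam | mom | mum

// Clean up the throwaway folder.
foreach (array_diff(scandir($dir), ['.', '..']) as $name) {
    unlink("$dir/$name");
}
rmdir($dir);
```

The second pattern also picks up the file named "m,m", because the comma is part of the character set.
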
The question mark ("?") represents exactly one character. So, "ba?" means "ba" followed by one more character: "ban", "bat", "bad", "bar", "bag" would be matched, but "ba" and "bath" would not.

"?.*" - This pattern would fetch all files whose names (excluding extension) consist of a single character, like "a.php", "j.mov" etc, but not "ab.txt" or "pol.php".
"?" would fetch all files/folders whose names (including extension) consist of a single character.
"????.txt" would fetch those .txt files whose names (excluding extension) are composed of exactly 4 characters.

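The question mark patterns can be tried without creating any files by using PHP's fnmatch() function, which tests a single name against a shell wildcard pattern. Note that this is a different function from glob(), used here only for illustration; filter_names() and the sample names are made up for this sketch.

```php
<?php
// Hypothetical helper: keeps the names that match a shell wildcard pattern.
function filter_names(array $names, string $pattern): array
{
    $matched = [];
    foreach ($names as $name) {
        // fnmatch() checks one string against the pattern; no disk access.
        if (fnmatch($pattern, $name)) {
            $matched[] = $name;
        }
    }
    return $matched;
}

$names = ['ba', 'bat', 'bath', 'a.php', 'ab.txt', 'data.txt', 'log.txt'];

foreach (['ba?', '?.*', '????.txt'] as $pattern) {
    echo $pattern . ' => ' . implode(', ', filter_names($names, $pattern)) . "\n";
}
// ba? => bat
// ?.* => a.php
// ????.txt => data.txt
```
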
An exclamation mark as the first character inside square brackets implies "logical NOT" or exclusion. A pattern like "a[!x]e" would match "are" but not "axe".

"[!a].txt" - This pattern would match files like "b.txt", "c.txt" but not "a.txt".

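The exclusion patterns above can be checked quickly with fnmatch(), which tests one name against a shell wildcard pattern. It is a separate function from glob(), so treat this only as a quick sketch for experimenting; the sample names are made up.

```php
<?php
// Each call tests a single name against a wildcard pattern.
var_dump(fnmatch('a[!x]e', 'are'));      // bool(true)
var_dump(fnmatch('a[!x]e', 'axe'));      // bool(false)
var_dump(fnmatch('[!a].txt', 'b.txt'));  // bool(true)
var_dump(fnmatch('[!a].txt', 'a.txt'));  // bool(false)

// A bracket expression still matches exactly one character.
var_dump(fnmatch('[!a].txt', 'bb.txt')); // bool(false)
```
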
In the next article, "DirectoryIterator in PHP", we discuss the DirectoryIterator class, which provides another way to browse files and folders.
