欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

php删除文本文件中重复行的方法

程序员文章站 2022-05-22 13:11:39
本文实例讲述了php删除文本文件中重复行的方法。分享给大家供大家参考。具体分析如下: 这个php函数用来删除文件中的重复行,还可以指定是否忽略大小写,和指定换行符...

本文实例讲述了php删除文本文件中重复行的方法。分享给大家供大家参考。具体分析如下:

这个php函数用来删除文件中的重复行,还可以指定是否忽略大小写,和指定换行符

/**
 * removeduplicatedlines
 * this function removes all duplicated lines of the given text file.
 *
 * @param   string
 * @param   bool
 * @return  string
 */
function removeduplicatedlines($filepath, $ignorecase=false, $newline="\n"){
  if (!file_exists($filepath)){
    $errormsg = 'removeduplicatedlines error: ';
    $errormsg .= 'the given file ' . $filepath . ' does not exist!';
    die($errormsg);
  }
  $content = file_get_contents($filepath);
  $content = removeduplicatedlinesbystring($content, $ignorecase, $newline);
  // is the file writeable?
  if (!is_writeable($filepath)){
    $errormsg = 'removeduplicatedlines error: ';
    $errormsg .= 'the given file ' . $filepath . ' is not writeable!';  
    die($errormsg);
  }
  // write the new file
  $fileresource = fopen($filepath, 'w+');   
  fwrite($fileresource, $content);    
  fclose($fileresource);  
}
 
/**
 * removeduplicatedlinesbystring
 * this function removes all duplicated lines of the given string.
 *
 * @param   string
 * @param   bool
 * @return  string
 */
function removeduplicatedlinesbystring($lines, $ignorecase=false, $newline="\n"){
  if (is_array($lines))
    $lines = implode($newline, $lines);
  $lines = explode($newline, $lines);
  $linearray = array();
  $duplicates = 0;
  // go trough all lines of the given file
  for ($line=0; $line < count($lines); $line++){
    // trim whitespace for the current line
    $currentline = trim($lines[$line]);
    // skip empty lines
    if ($currentline == '')
      continue;
    // use the line contents as array key
    $linekey = $currentline;
    if ($ignorecase)
      $linekey = strtolower($linekey);
    // check if the array key already exists,
    // if not add it otherwise increase the counter
    if (!isset($linearray[$linekey]))
      $linearray[$linekey] = $currentline;    
    else        
      $duplicates++;
  }
  // sort the array
  asort($linearray);
  // return how many lines got removed
  return implode($newline, array_values($linearray));  
}

使用范例:

// example 1
// removes all duplicated lines of the file definied in the first parameter.
$removedlinescount = removeduplicatedlines('test.txt');
print "removed $removedlinescount duplicate lines from the test.txt file.";
// example 2 (ignore case)
// same as above, just ignores the line case.
removeduplicatedlines('test.txt', true);
// example 3 (custom new line character)
// by using the 3rd parameter you can define which character
// should be used as new line indicator. in this case
// the example file looks like 'foo;bar;foo;foo' and will
// be replaced with 'foo;bar' 
removeduplicatedlines('test.txt', false, ';');

希望本文所述对大家的php程序设计有所帮助。