A homemade riv file Creator that handles input and output of cross-section survey results from multiple periods

July 29, 2024, 03:47

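1. Background

I needed to load cross-section survey data for a certain river, taken at several past periods, into iRIC. When I checked the data, I found that the format differs slightly before and after a certain point in time. It turned out that in 2008 (Heisei 20) the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) issued a guideline defining the format of periodic river longitudinal and cross-section survey data, and survey results produced since then follow that format. Results from before then, however, do not appear to have been produced in any unified format.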

https://www.mlit.go.jp/river/shishin_guideline/kasen/gis/pdf_docs/juoudan/guideline0805.pdf

There are two ways to import cross-section survey data into iRIC: ① create a riv file and import it, or ② import data in the MLIT format as is.

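The old pre-2008 format cannot be imported as is, which was a problem. Fortunately, the only difference in the old-format data I handle is that all cross sections are gathered in a single csv file, so all I had to do was cut it into a separate csv file per cross section. I expected this to let me import the data via method ②, but for some reason the old-format data still could not be imported. (Possibly an encoding problem?)
I therefore went with method ① and created the riv file using "rivファイルCreator" (riv file Creator), an Excel macro program published on the iRIC website. (This worked.)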
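
The splitting step described above (one combined csv into one csv per cross section) could be sketched roughly as follows. The layout of the combined old-format file is not described in this article, so the sketch assumes a hypothetical one: each row is `section name, distance, elevation`, and rows of the same cross section are stored contiguously. The function name `split_by_section`, the output file naming, and the per-section layout it writes are likewise illustrative assumptions, not the MLIT format itself.

```python
import csv
import os
from itertools import groupby


def split_by_section(combined_csv, out_dir, encoding="shift_jis"):
    """Split one combined csv into one csv per cross section (assumed layout)."""
    os.makedirs(out_dir, exist_ok=True)
    with open(combined_csv, newline="", encoding=encoding) as f:
        rows = [row for row in csv.reader(f) if row]  # drop empty lines

    written = []
    # groupby only merges consecutive rows, hence the contiguity assumption
    for name, group in groupby(rows, key=lambda row: row[0]):
        out_path = os.path.join(out_dir, f"{name}.csv")
        with open(out_path, "w", newline="", encoding=encoding) as out:
            writer = csv.writer(out)
            # first row: section name in the first cell
            writer.writerow([name, "", ""])
            # following rows: point number, distance, elevation
            for number, row in enumerate(group, start=1):
                writer.writerow([number, row[1], row[2]])
        written.append(out_path)
    return written
```

If the real combined file interleaves cross sections, the rows would need to be sorted or grouped by name first, otherwise later groups overwrite earlier files of the same name.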

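With this program, however, converting cross-section survey results from multiple periods to riv means setting the input and output and running the conversion separately for each one, which felt rather tedious when handling many survey results.
I thought about tweaking the macro code, but it is password-protected and I could not view it. (I didn't know you could do that.) So I decided to write my own in Python.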

2. The code I wrote

I say "wrote", but most of it was written by generative AI.

＜Importing libraries＞
Files are first read as Shift-JIS, and chardet is used only if that raises an error. Install it with "pip install chardet".

import os
import pandas as pd
import chardet

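＜Setting the input and output paths＞
Here, kpxy is the XY coordinates of the distance markers and LH is the cross-section survey results. The LH files of one survey must be stored as csv files in the same folder.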

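# Input/output path settings
kpxy_path = r"N:\kpxy.csv"  # csv file with the XY coordinates of the distance markers; columns from the left: KP, LX, LY, RX, RY
lh_dirs = [
    r"N:\LH1",
    r"N:\LH2",
    r"N:\LH3"
    # multiple LH folders can be listed
]
lh_prefix = ''  # only csv files whose names start with this string are processed
output_dir = r'N:\output'  # output folder; each riv file is named after its input LH folder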
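
For reference, the input layout that the main body of this article reads can be sketched with a few made-up files. The sketch below writes a tiny kpxy.csv and one LH csv into a folder of your choice. All values, the file name `0.2.csv`, and the helper name `write_sample_inputs` are hypothetical, and what column 0 of the data rows holds is an assumption (the main body only reads columns 1 and 2 of each row after the first). The first row is padded to three fields so that the following three-field rows are not treated as malformed lines.

```python
import csv
import os


def write_sample_inputs(base_dir):
    """Write a made-up kpxy.csv and one LH csv; return their paths."""
    kpxy_path = os.path.join(base_dir, "kpxy.csv")
    with open(kpxy_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # header row with the column names the main body looks up
        writer.writerow(["KP", "LX", "LY", "RX", "RY"])
        # one distance marker with made-up coordinates
        writer.writerow([0.2, -1000.0, 2000.0, -1010.0, 2100.0])

    lh_dir = os.path.join(base_dir, "LH1")
    os.makedirs(lh_dir, exist_ok=True)
    lh_path = os.path.join(lh_dir, "0.2.csv")
    with open(lh_path, "w", newline="", encoding="shift_jis") as f:
        writer = csv.writer(f)
        # first row: section name (KP) in the first cell
        writer.writerow([0.2, "", ""])
        # following rows: column 1 = distance, column 2 = elevation
        writer.writerow([1, 0.0, 5.2])
        writer.writerow([2, 10.0, 3.1])
        writer.writerow([3, 20.0, 5.0])
    return kpxy_path, lh_path
```

Pointing `kpxy_path` and `lh_dirs` at files like these is a convenient way to try the script before running it on real survey data.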

＜Main body＞

class RivFileCreater:
    def __init__(self, kpxy_path, lh_dirs, lh_prefix, output_dir):
        self.kpxy_data = pd.read_csv(kpxy_path)
        self.lh_dirs = lh_dirs
        self.lh_prefix = lh_prefix
        self.output_dir = output_dir

    def create_riv_files(self):
        for lh_dir in self.lh_dirs:
            riv_content = "#survey\n"
            # Add the kpxy data excluding the header
            for index, row in self.kpxy_data.iterrows():
                section_name = row['KP']
                section_name = self.format_section_name(section_name)
                riv_content += f"{section_name} {row['LY']} {row['LX']} {row['RY']} {row['RX']}\n"

            riv_content += "#x-section\n"

            matched_sections = []
            unmatched_sections = []

            # Process LH files in the current directory
            for lh_file in os.listdir(lh_dir):
                if lh_file.lower().endswith(".csv") and lh_file.startswith(self.lh_prefix):
                    lh_path = os.path.join(lh_dir, lh_file)
                    lh_data = self.read_lh_file(lh_path)
                    if lh_data is not None:
                        section_name = self.extract_section_name(lh_data)
                        if section_name is not None:
                            kpxy_row = self.kpxy_data[self.kpxy_data['KP'] == section_name]
                            if not kpxy_row.empty:
                                riv_content += self.generate_riv_content(section_name, lh_data)
                                matched_sections.append(section_name)
                            else:
                                unmatched_sections.append(section_name)

            # Skip writing when nothing matched, so the message below stays true
            if not matched_sections:
                print(f"No sections matched for {lh_dir}. The .riv file was not created.")
                continue

            # Define the output file path (create the output folder if needed)
            os.makedirs(self.output_dir, exist_ok=True)
            output_file_name = f"{os.path.basename(lh_dir)}.riv"
            output_path = os.path.join(self.output_dir, output_file_name)

            # Write the riv content to a file
            with open(output_path, 'w', encoding='utf-8') as file:
                file.write(riv_content)

            # Output results
            if not unmatched_sections:
                print(f"All sections matched for {lh_dir} and the .riv file was created successfully.")
            else:
                print(f"The following sections matched for {lh_dir} and were included in the .riv file:")
                print(", ".join(map(str, matched_sections)))
                print(f"The following sections did not match for {lh_dir} and were not included:")
                print(", ".join(map(str, unmatched_sections)))

    def read_lh_file(self, lh_path):
        try:
            # Try reading with shift_jis encoding first
            return pd.read_csv(lh_path, header=None, encoding='shift_jis', on_bad_lines='skip')
        except UnicodeDecodeError:
            # If shift_jis fails, use chardet to detect the encoding
            with open(lh_path, 'rb') as file:
                encoding = chardet.detect(file.read())['encoding']
            try:
                return pd.read_csv(lh_path, header=None, encoding=encoding, on_bad_lines='skip')
            except Exception as e:
                # Report the file instead of skipping it silently
                print(f"Could not read {lh_path}: {e}")
                return None

    def extract_section_name(self, lh_data):
        # Attempt to extract section name from the first cell
        section_name = lh_data.iloc[0, 0]
        return self.format_section_name(section_name)

    def format_section_name(self, section_name):
        try:
            # Try converting to float first
            section_name = float(section_name)
            # If it's an integer, convert to int
            if section_name.is_integer():
                section_name = int(section_name)
        except ValueError:
            # If it cannot be converted to float, keep it as is
            pass
        return section_name

    def generate_riv_content(self, section_name, lh_data):
        # Collect distance/elevation pairs, skipping the first row and incomplete rows
        points = []
        for index, row in lh_data.iterrows():
            if index > 0:  # Skip the first row
                distance = row[1]
                elevation = row[2]
                if not pd.isna(distance) and not pd.isna(elevation):
                    points.append(f"{distance} {elevation}")

        # First line with section name and the number of points actually written
        content = f"{section_name} {len(points)}\n"

        # Add distance and elevation data, 5 sets per line
        for i in range(0, len(points), 5):
            content += " ".join(points[i:i + 5]) + "\n"

        return content
  
creator = RivFileCreater(kpxy_path, lh_dirs, lh_prefix, output_dir)
creator.create_riv_files()
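
As a quick sanity check on the output, the sketch below parses riv text in the layout this script writes (a `#survey` block, then an `#x-section` block in which each section has a `name count` line followed by distance/elevation pairs) and raises an error when a section's declared point count cannot be read. The function name `parse_riv_text` is hypothetical, and the parser only covers what this script emits, not the full riv specification.

```python
def parse_riv_text(text):
    """Parse riv text as written by the script above into (survey, sections)."""
    survey, sections = {}, {}
    head, _, body = text.partition("#x-section")

    # "#survey" part: name followed by four coordinate values per line
    for line in head.splitlines():
        parts = line.split()
        if len(parts) == 5:
            survey[parts[0]] = tuple(float(v) for v in parts[1:])

    # "#x-section" part: "name count", then count (distance, elevation) pairs
    tokens = body.split()
    pos = 0
    while pos < len(tokens):
        if pos + 1 >= len(tokens):
            raise ValueError(f"section {tokens[pos]}: missing point count")
        name, count = tokens[pos], int(tokens[pos + 1])
        values = [float(v) for v in tokens[pos + 2: pos + 2 + 2 * count]]
        if len(values) != 2 * count:
            raise ValueError(f"section {name}: expected {count} points")
        sections[name] = list(zip(values[0::2], values[1::2]))
        pos += 2 + 2 * count
    return survey, sections


if __name__ == "__main__":
    sample = (
        "#survey\n"
        "0.2 2000.0 -1000.0 2100.0 -1010.0\n"
        "#x-section\n"
        "0.2 3\n"
        "0.0 5.2 10.0 3.1 20.0 5.0\n"
    )
    survey, sections = parse_riv_text(sample)
    print(sorted(survey), {name: len(pts) for name, pts in sections.items()})
```

Comparing the section names in `survey` and `sections` for each generated file is one simple way to spot distance markers that have no cross-section data.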

3. Thoughts

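This code is useful for converting multiple cross-section survey results to riv in one go, but given the time I spent writing it, using the original tool from the start would have been faster. Still, it feels good to make a tool that suits me. (Even though I only had generative AI make it.)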

Corrections to this code are welcome, but I cannot compensate for any loss caused by using it.
