/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Apache Commons CSV Format Support.
*
* CSV files are widely used as interfaces to legacy systems and for manual
* data imports. CSV stands for "Comma Separated Values" (or sometimes
* "Character Separated Values"). The CSV data format is defined in RFC 4180,
* but many dialects exist.
*
* Common to all dialects is the basic structure: the CSV data format is
* record-oriented, and each record starts on a new textual line. A record
* is built from a list of values. Keep in mind that records need not all
* have the same number of values:
*
* csv := records*
* record := values*
*
*
* The following list contains the CSV aspects the Commons CSV parser supports:
*
* - Separators (for lines)
* - The record separators are hardcoded and cannot be changed. They must be '\r', '\n' or '\r\n'.
*
* - Delimiter (for values)
* - The delimiter for values is freely configurable (default ',').
*
* - Comments
* - Some CSV dialects support a simple comment syntax. A comment is a record
* that starts with a designated character (the commentStarter). A record
* of this kind is treated as a comment and removed from the input (default: none).
*
* - Encapsulator
* - A pair of encapsulator characters (default '"') is used to enclose complex values (see "Complex values").
*
* - Simple values
* - A simple value consists of all characters (except the delimiter) up to
* (but not including) the next delimiter or record terminator. Optionally,
* whitespace surrounding a simple value can be ignored (default: true).
*
* - Complex values
* - Complex values are enclosed in a pair of the defined encapsulator characters.
* The encapsulator itself must be escaped or doubled when used inside a complex value.
* Complex values preserve all formatting, including newlines, which allows multi-line values.
*
* - Empty line skipping
* - Optionally, empty lines in CSV files can be skipped.
* Otherwise, an empty line yields a record with a single empty value.
*
*
* In addition to individually defined dialects, two predefined dialects (strict-csv and excel-csv)
* can be set directly.
*
* Example usage:
*
* Reader in = new StringReader("a,b,c");
* for (CSVRecord record : CSVFormat.DEFAULT.parse(in)) {
*     for (String field : record) {
*         System.out.print("\"" + field + "\", ");
*     }
*     System.out.println();
* }
*
*/
package org.apache.commons.csv;