org.apache.lucene.store.FSDirectory Maven / Gradle / Ivy

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.lucene.store;


import java.io.FilterOutputStream;
import java.io.IOException;
import java.nio.channels.ClosedChannelException; // javadoc @link
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Future;

import org.apache.lucene.util.Constants;
import org.apache.lucene.util.IOUtils;

/**
 * Base class for Directory implementations that store index
 * files in the file system.
 *
 * <p>There are currently three core subclasses:
 *
 * <ul>
 *
 *  <li>{@link SimpleFSDirectory} is a straightforward
 *       implementation using Files.newByteChannel.
 *       However, it has poor concurrent performance
 *       (multiple threads will bottleneck) as it
 *       synchronizes when multiple threads read from the
 *       same file.
 *
 *  <li>{@link NIOFSDirectory} uses java.nio's
 *       FileChannel's positional io when reading to avoid
 *       synchronization when reading from the same file.
 *       Unfortunately, due to a Windows-only Sun
 *       JRE bug this is a poor choice for Windows, but
 *       on all other platforms this is the preferred
 *       choice. Applications using {@link Thread#interrupt()} or
 *       {@link Future#cancel(boolean)} should use
 *       {@code RAFDirectory} instead. See the {@link NIOFSDirectory} Javadoc
 *       for details.
 *
 *  <li>{@link MMapDirectory} uses memory-mapped IO when
 *       reading. This is a good choice if you have plenty
 *       of virtual memory relative to your index size, e.g.
 *       if you are running on a 64 bit JRE, or you are
 *       running on a 32 bit JRE but your index sizes are
 *       small enough to fit into the virtual memory space.
 *       Java currently has the limitation of not being able to
 *       unmap files from user code. The files are unmapped when GC
 *       releases the byte buffers. Due to a bug in Sun's JRE,
 *       MMapDirectory's {@link IndexInput#close}
 *       is unable to close the underlying OS file handle. Only when
 *       GC finally collects the underlying objects, which could be
 *       quite some time later, will the file handle be closed.
 *       This will consume additional transient disk usage: on Windows,
 *       attempts to delete or overwrite the files will result in an
 *       exception; on other platforms, which typically have "delete on
 *       last close" semantics, such operations will succeed, but the bytes
 *       are still consuming space on disk. For many applications this
 *       limitation is not a problem (e.g. if you have plenty of disk space,
 *       and you don't rely on overwriting files on Windows) but it's still
 *       an important limitation to be aware of. This class supplies a
 *       (possibly dangerous) workaround mentioned in the bug report,
 *       which may fail on non-Sun JVMs.
 * </ul>
 *
 * <p>Unfortunately, because of system peculiarities, there is
 * no single overall best implementation. Therefore, we've
 * added the {@link #open} method, to allow Lucene to choose
 * the best FSDirectory implementation given your
 * environment, and the known limitations of each
 * implementation. For users who have no reason to prefer a
 * specific implementation, it's best to simply use {@link
 * #open}. For all others, you should instantiate the
 * desired implementation directly.
 *
 * <p><b>NOTE</b>: Accessing one of the above subclasses either directly or
 * indirectly from a thread while it's interrupted can close the
 * underlying channel immediately if at the same time the thread is
 * blocked on IO. The channel will remain closed and subsequent access
 * to the index will throw a {@link ClosedChannelException}.
 * Applications using {@link Thread#interrupt()} or
 * {@link Future#cancel(boolean)} should use the slower legacy
 * {@code RAFDirectory} from the {@code misc} Lucene module instead.
 *
 * <p>The locking implementation is by default {@link
 * NativeFSLockFactory}, but can be changed by
 * passing in a custom {@link LockFactory} instance.
 *
 * @see Directory
 */
public abstract class FSDirectory extends BaseDirectory {

  protected final Path directory; // The underlying filesystem directory

  /** Create a new FSDirectory for the named location (ctor for subclasses).
   * The directory is created at the named location if it does not yet exist.
   *
   * <p>{@code FSDirectory} resolves the given Path to a canonical /
   * real path to ensure it can correctly lock the index directory and no other process
   * can interfere with changing possible symlinks to the index directory in between.
   * If you want to use symlinks and change them dynamically, close all
   * {@code IndexWriters} and create a new {@code FSDirectory} instance.
   * @param path the path of the directory
   * @param lockFactory the lock factory to use, or null for the default
   * ({@link NativeFSLockFactory});
   * @throws IOException if there is a low-level I/O error
   */
  protected FSDirectory(Path path, LockFactory lockFactory) throws IOException {
    super(lockFactory);
    // If only read access is permitted, createDirectories fails even if the directory already exists.
    if (!Files.isDirectory(path)) {
      Files.createDirectories(path);  // create directory, if it doesn't exist
    }
    directory = path.toRealPath();
  }

  /** Creates an FSDirectory instance, trying to pick the
   *  best implementation given the current environment.
   *  The directory returned uses the {@link NativeFSLockFactory}.
   *  The directory is created at the named location if it does not yet exist.
   *
   * <p>{@code FSDirectory} resolves the given Path when calling this method to a canonical /
   * real path to ensure it can correctly lock the index directory and no other process
   * can interfere with changing possible symlinks to the index directory in between.
   * If you want to use symlinks and change them dynamically, close all
   * {@code IndexWriters} and create a new {@code FSDirectory} instance.
   *
   * <p>Currently this returns {@link MMapDirectory} for Linux, MacOSX, Solaris,
   *  and Windows 64-bit JREs, {@link NIOFSDirectory} for other
   *  non-Windows JREs, and {@link SimpleFSDirectory} for other
   *  JREs on Windows. It is highly recommended that you consult the
   *  implementation's documentation for your platform before
   *  using this method.
   *
   * <p><b>NOTE</b>: this method may suddenly change which
   * implementation is returned from release to release, in
   * the event that higher performance defaults become
   * possible; if the precise implementation is important to
   * your application, please instantiate it directly,
   * instead. For optimal performance you should consider using
   * {@link MMapDirectory} on 64 bit JVMs.
   *
   * <p>See above
   */
  public static FSDirectory open(Path path) throws IOException {
    return open(path, FSLockFactory.getDefault());
  }

  /** Just like {@link #open(Path)}, but allows you to
   *  also specify a custom {@link LockFactory}. */
  public static FSDirectory open(Path path, LockFactory lockFactory) throws IOException {
    if (Constants.JRE_IS_64BIT && MMapDirectory.UNMAP_SUPPORTED) {
      return new MMapDirectory(path, lockFactory);
    } else if (Constants.WINDOWS) {
      return new SimpleFSDirectory(path, lockFactory);
    } else {
      return new NIOFSDirectory(path, lockFactory);
    }
  }

  /** Lists all files (including subdirectories) in the
   *  directory.
   *
   *  @throws IOException if there was an I/O error during listing */
  public static String[] listAll(Path dir) throws IOException {
    List<String> entries = new ArrayList<>();

    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path path : stream) {
        entries.add(path.getFileName().toString());
      }
    }

    return entries.toArray(new String[entries.size()]);
  }

  @Override
  public String[] listAll() throws IOException {
    ensureOpen();
    return listAll(directory);
  }

  /** Returns the length in bytes of a file in the directory. */
  @Override
  public long fileLength(String name) throws IOException {
    ensureOpen();
    return Files.size(directory.resolve(name));
  }

  /** Removes an existing file in the directory. */
  @Override
  public void deleteFile(String name) throws IOException {
    ensureOpen();
    Files.delete(directory.resolve(name));
  }

  /** Creates an IndexOutput for the file with the given name. */
  @Override
  public IndexOutput createOutput(String name, IOContext context) throws IOException {
    ensureOpen();
    return new FSIndexOutput(name);
  }

  @Override
  public void sync(Collection<String> names) throws IOException {
    ensureOpen();

    for (String name : names) {
      fsync(name);
    }
  }

  @Override
  public void renameFile(String source, String dest) throws IOException {
    ensureOpen();
    Files.move(directory.resolve(source), directory.resolve(dest), StandardCopyOption.ATOMIC_MOVE);
    // TODO: should we move directory fsync to a separate 'syncMetadata' method?
    // for example, to improve listCommits(), IndexFileDeleter could also call that after deleting segments_Ns
    IOUtils.fsync(directory, true);
  }

  /** Closes the store to future operations. */
  @Override
  public synchronized void close() {
    isOpen = false;
  }

  /** @return the underlying filesystem directory */
  public Path getDirectory() {
    ensureOpen();
    return directory;
  }

  /** For debug output. */
  @Override
  public String toString() {
    return this.getClass().getSimpleName() + "@" + directory + " lockFactory=" + lockFactory;
  }

  final class FSIndexOutput extends OutputStreamIndexOutput {
    /**
     * The maximum chunk size is 8192 bytes, because file channel mallocs
     * a native buffer outside of stack if the write buffer size is larger.
     */
    static final int CHUNK_SIZE = 8192;

    public FSIndexOutput(String name) throws IOException {
      super("FSIndexOutput(path=\"" + directory.resolve(name) + "\")",
          new FilterOutputStream(Files.newOutputStream(directory.resolve(name),
              StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            // This implementation ensures that we never write more than CHUNK_SIZE bytes:
            @Override
            public void write(byte[] b, int offset, int length) throws IOException {
              while (length > 0) {
                final int chunk = Math.min(length, CHUNK_SIZE);
                out.write(b, offset, chunk);
                length -= chunk;
                offset += chunk;
              }
            }
          }, CHUNK_SIZE);
    }
  }

  protected void fsync(String name) throws IOException {
    IOUtils.fsync(directory.resolve(name), false);
  }
}
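
The chunking that `FSIndexOutput` applies to bulk writes can be tried in isolation, without Lucene on the classpath. The following is a minimal sketch, not part of the Lucene source: `ChunkedWriteSketch`, `RecordingStream`, `chunked`, and `chunkSizesFor` are hypothetical names introduced here for illustration, and an in-memory sink stands in for the file stream. It reuses the same loop as the listing above and records the size of each write that reaches the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch of the chunked-write wrapper; hypothetical, not part of Lucene. */
final class ChunkedWriteSketch {

  /** Same value as FSIndexOutput.CHUNK_SIZE in the listing above. */
  static final int CHUNK_SIZE = 8192;

  /** Hypothetical in-memory sink that records the size of every bulk write it receives. */
  static final class RecordingStream extends ByteArrayOutputStream {
    final List<Integer> writeSizes = new ArrayList<>();

    @Override
    public synchronized void write(byte[] b, int off, int len) {
      writeSizes.add(len);
      super.write(b, off, len);
    }
  }

  /** Wraps the delegate so that no single bulk write exceeds CHUNK_SIZE bytes. */
  static OutputStream chunked(OutputStream delegate) {
    return new FilterOutputStream(delegate) {
      @Override
      public void write(byte[] b, int offset, int length) throws IOException {
        while (length > 0) {
          final int chunk = Math.min(length, CHUNK_SIZE);
          out.write(b, offset, chunk); // 'out' is the protected field of FilterOutputStream
          length -= chunk;
          offset += chunk;
        }
      }
    };
  }

  /** Writes totalBytes zero bytes in one call and returns the write sizes seen by the sink. */
  static List<Integer> chunkSizesFor(int totalBytes) {
    RecordingStream sink = new RecordingStream();
    try (OutputStream stream = chunked(sink)) {
      stream.write(new byte[totalBytes], 0, totalBytes);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.writeSizes;
  }

  public static void main(String[] args) {
    System.out.println(chunkSizesFor(20000)); // prints [8192, 8192, 3616]
  }
}
```

A single 20000-byte write reaches the sink as two full chunks plus a 3616-byte remainder, so the underlying stream never sees a buffer larger than `CHUNK_SIZE`.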




