question:Write a function that takes a large text document as input and returns the k most frequent words in the document. Ensure that your function is efficient in terms of both time and space complexity. The function should be case-insensitive and should disregard punctuation marks such as commas, periods, and exclamation points. Words should be treated as sequences of alphabetic characters only. The function should also handle ties by returning the words in alphabetical order when they have the same frequency count. Additionally, the function should be designed to handle very large files, so ensure that it efficiently processes the text. Your function should return a list of tuples where each tuple contains a word and its frequency count, sorted by the frequency count in descending order. If k is greater than the number of unique words, the function should return all unique words. Duplicate words and their counts should be combined properly. Write appropriate error handling to manage edge cases such as an empty file, files that do not exist, or files with non-alphabetic characters only.
answer:
import re
from collections import Counter

def k_most_frequent_words(text, k):
    """
    Returns the k most frequent words in the text.

    Args:
        text (str): input text document as a string
        k (int): number of most frequent words to return

    Returns:
        List[Tuple[str, int]]: list of tuples containing words and their frequency count
    """
    if not text or not isinstance(text, str):
        return []

    # Normalize the text
    text = text.lower()
    words = re.findall(r'\b[a-z]+\b', text)  # Extract words

    if not words:
        return []

    # Count the frequency of words
    word_count = Counter(words)

    # Get the k most common words and handle ties alphabetically
    most_common = word_count.most_common()
    most_common.sort(key=lambda x: (-x[1], x[0]))

    return most_common[:k]
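The question also asks for very large files and for missing or empty files to be handled, while the answer above works on a string that is already in memory. A minimal sketch of a file-based variant follows. The function name `k_most_frequent_words_in_file`, the `encoding` parameter, and the choice to return an empty list for a missing file or a non-positive `k` are assumptions made for illustration, not part of the original answer. It reads the file line by line, so only the word counts are held in memory, and it uses `heapq.nsmallest` to avoid sorting every unique word when `k` is small.

```python
import heapq
import re
from collections import Counter

# Same word pattern as the string-based answer: runs of lowercase letters.
WORD_RE = re.compile(r"\b[a-z]+\b")

def k_most_frequent_words_in_file(path, k, encoding="utf-8"):
    """Hypothetical file-based variant: counts words one line at a time."""
    if not isinstance(k, int) or k <= 0:
        return []  # assumption: a non-positive k yields no results

    counts = Counter()
    try:
        with open(path, "r", encoding=encoding, errors="ignore") as handle:
            for line in handle:  # only one line is held in memory at a time
                counts.update(WORD_RE.findall(line.lower()))
    except FileNotFoundError:
        return []  # assumption: a missing file is reported as "no words"

    # Smallest (-count, word) keys first: highest count, then alphabetical.
    # An empty file or a file without letters leaves counts empty, giving [].
    return heapq.nsmallest(k, counts.items(), key=lambda item: (-item[1], item[0]))
```

Raising the exception instead of returning an empty list may be the better choice when the caller needs to tell a missing file apart from an empty one.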
question:Create a function in Python that takes a list of integers as input and returns a new list containing only the prime numbers from the input list. Implement the function with a nested helper function to check for prime numbers. Ensure the solution is optimized for large input sizes.
answer:
def filter_primes(numbers):
    """
    Returns a list of prime numbers from the input list.

    :param numbers: List[int] - A list of integers.
    :return: List[int] - A list of prime numbers from the input list.
    """
    def is_prime(n):
        """
        Checks if a number is prime.

        :param n: int - The number to check.
        :return: bool - True if the number is prime, False otherwise.
        """
        if n <= 1:
            return False
        if n == 2 or n == 3:
            return True
        if n % 2 == 0 or n % 3 == 0:
            return False
        i = 5
        while i * i <= n:
            if n % i == 0 or n % (i + 2) == 0:
                return False
            i += 6
        return True

    return [num for num in numbers if is_prime(num)]
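For long input lists whose largest value is moderate, testing each number by trial division repeats a lot of work. One possible alternative, sketched below, is a Sieve of Eratosthenes built once up to the largest value and then used as a lookup table. This is a different technique from the nested-helper answer, not a correction of it; the name `filter_primes_with_sieve` is hypothetical, and the approach assumes the maximum value is small enough that a table of that size fits in memory.

```python
def filter_primes_with_sieve(numbers):
    """Hypothetical variant: precompute primality up to max(numbers) once."""
    # Largest candidate; values below 2 are never prime.
    limit = max((n for n in numbers if n >= 2), default=1)
    if limit < 2:
        return []

    # is_prime[i] == 1 means i is still considered prime.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0

    i = 2
    while i * i <= limit:
        if is_prime[i]:
            # Clear every multiple of i, starting at i * i.
            multiples = len(range(i * i, limit + 1, i))
            is_prime[i * i :: i] = bytearray(multiples)
        i += 1

    # Preserve input order and duplicates, as the original answer does.
    return [n for n in numbers if n >= 2 and is_prime[n]]


print(filter_primes_with_sieve([10, 2, 3, 4, 17, -5, 0, 1, 17, 25]))  # [2, 3, 17, 17]
```

Memory grows with the largest value rather than with the length of the list, so the trial-division version remains the safer choice when a few inputs are very large.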
question:Write a function in Python that takes a list of strings and a target string, and returns the list of strings sorted by the number of characters they have in common with the target string. Only consider common characters once (even if they occur multiple times in both strings). If two strings have the same number of common characters, they should retain their original order.

Input:
- A list of strings: ['apple', 'banana', 'pear', 'peach']
- A target string: 'pale'

Output:
- The sorted list of strings: ['apple', 'pear', 'peach', 'banana']

The function should be defined as follows:
```python
def sort_by_common_chars(strings, target):
    # Your code here
```
answer:
def count_common_chars(s, target_set):
    """
    Counts the number of unique characters in s that are also in the target_set.
    """
    return len(set(s) & target_set)

def sort_by_common_chars(strings, target):
    """
    Sorts a list of strings based on the number of unique characters they have
    in common with the target string.
    """
    target_set = set(target)
    return sorted(strings, key=lambda s: count_common_chars(s, target_set), reverse=True)
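To see why the sample input produces the stated output, it can help to print the score of each string next to the sorted result. The sketch below is illustrative only: `common_char_scores` is a hypothetical helper, and the sort uses a negated key instead of `reverse=True`, which gives the same ordering. Python's `sorted` is stable, so 'pear' and 'peach', which both share three characters with 'pale', keep their original relative order.

```python
def common_char_scores(strings, target):
    """Hypothetical helper: pairs each string with its shared-character count."""
    target_set = set(target)
    return [(s, len(set(s) & target_set)) for s in strings]

def sort_by_common_chars(strings, target):
    target_set = set(target)
    # Negating the count sorts high scores first; ties keep input order.
    return sorted(strings, key=lambda s: -len(set(s) & target_set))


strings = ['apple', 'banana', 'pear', 'peach']
print(common_char_scores(strings, 'pale'))
# [('apple', 4), ('banana', 1), ('pear', 3), ('peach', 3)]
print(sort_by_common_chars(strings, 'pale'))
# ['apple', 'pear', 'peach', 'banana']
```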
question:How would you implement a function that takes a matrix of integers and returns the sum of each of its diagonals? The primary diagonal runs from the top left to the bottom right, while the secondary diagonal runs from the top right to the bottom left. Your function should return a tuple containing the sum of the primary diagonal and the sum of the secondary diagonal. Note that the matrix is always square (n x n) and you should handle matrices of various sizes. Illustrate your solution with an example and explain the expected output.
answer:
def sum_of_diagonals(matrix):
    """
    Returns a tuple (sum_primary_diagonal, sum_secondary_diagonal)
    containing the sum of the primary diagonal and the secondary diagonal
    of a square matrix.
    """
    n = len(matrix)
    sum_primary_diagonal = sum(matrix[i][i] for i in range(n))
    sum_secondary_diagonal = sum(matrix[i][n - 1 - i] for i in range(n))
    return (sum_primary_diagonal, sum_secondary_diagonal)

# Example
# Input:
# [
#   [1, 2, 3],
#   [4, 5, 6],
#   [7, 8, 9]
# ]
# Primary Diagonal: 1 + 5 + 9 = 15
# Secondary Diagonal: 3 + 5 + 7 = 15
# Output: (15, 15)
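Since the question asks for matrices of various sizes, a few more cases may be worth checking. The sketch below repeats the function so that it runs on its own; the sample matrices are made up for illustration. Note that for odd n the centre element belongs to both diagonals and is therefore counted in both sums.

```python
def sum_of_diagonals(matrix):
    n = len(matrix)
    primary = sum(matrix[i][i] for i in range(n))
    secondary = sum(matrix[i][n - 1 - i] for i in range(n))
    return (primary, secondary)


# 1 x 1: the single element lies on both diagonals.
print(sum_of_diagonals([[7]]))             # (7, 7)

# 2 x 2: primary is 1 + 5, secondary is 2 + 3.
print(sum_of_diagonals([[1, 2], [3, 5]]))  # (6, 5)

# Empty matrix: both sums are 0 because there is nothing to add.
print(sum_of_diagonals([]))                # (0, 0)
```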